Synopsis


Mentoring is one of the most commonly used interventions to prevent, divert, and
remediate youth engaged in, or thought to be at risk for, delinquent behavior, school
failure, aggression, or other antisocial behavior. In this update we report on a
meta-analytic review of selective and indicated mentoring interventions that have been
evaluated for their effects on delinquency outcomes for youth (e.g., arrest or
conviction as a delinquent, self-reported involvement) and key associated outcomes
(aggression, drug use, academic functioning). Of 164 identified studies published
between 1970 and 2011, 46 met criteria for inclusion. Mean effect sizes were
significant and positive for delinquency and academic functioning, with trends
(marginal significance level) for aggression and drug use. Effect sizes were modest
by Cohen's differentiation. However, there was heterogeneity in effect sizes across
studies for each outcome. The obtained patterns of effects suggest mentoring may be
valuable for those at risk for or already involved in delinquency and for associated
outcomes. Comparison of study design (RCT vs. QE) did not show significant
differences in effects. Moderator analysis showed larger effects when professional
development was the motivation of the mentors for involvement, but effects did not
differ by basis of inclusion of participants (environmental vs. person basis of risk),
presence of other interventions, or assessment of quality of fidelity. We also
undertook the first systematic evaluation of key processes that seem to define how
mentoring may aid youth (e.g., identification/modeling, teaching, emotional support,
advocacy) to see if these related to effects. Based on studies we could code for the
presence or absence of each as part of the program effort, analyses found stronger
effects when emotional support and advocacy were emphasized. These results suggest
mentoring is as effective for high-risk youth in relation to delinquency as many other
preventive and treatment approaches and that emphasis on some theorized key
processes may be more valuable than others. However, the collected set of studies is
less informative than expected, with quite limited specification about what comprised
the mentoring program and implementation features. The juxtaposition of popular
interest in mentoring and empirical evidence of benefits with the limited reporting of
important features of the interventions highlights the importance of more
careful and extensive evaluations. Including features that permit testing of
selection basis, program organization and features, implementation variations, and
theorized processes for effects will greatly improve understanding of this
intervention. All are essential to guide effective practice of this popular and very
promising approach.


Abstract


BACKGROUND 


Mentoring has drawn substantial interest from policymakers, intervention theorists,
and those interested in identifying promising and useful evidence-based approaches
to interventions for criminal justice and child welfare outcomes (Grossman &
Tierney, 1998; Jekielek et al., 2002). Mentoring is one of the most commonly used
interventions to prevent, divert, and remediate youth engaged in, or thought to be at
risk for, delinquent behavior, school failure, aggression, or other antisocial behavior
(DuBois, Holloway, Valentine, & Cooper, 2002; DuBois, Portillo, Rhodes,
Silverthorn, & Valentine, 2011). One account lists over 5000 organizations within
the United States that use mentoring to promote youth wellbeing and reduce risk
(MENTOR/National Mentoring Partnership, 2006). Definitions of mentoring vary,
but there are common elements. For the purpose of this review, mentoring was
defined by the following 4 characteristics: 1) interaction between two individuals
over an extended period of time, 2) inequality of experience, knowledge, or power
between the mentor and mentee (recipient), with the mentor possessing the greater
share, 3) the mentee is in a position to imitate and benefit from the knowledge, skill,
ability, or experience of the mentor, 4) the absence of the role inequality that typifies
other helping relationships and is marked by professional training, certification, or
predetermined status differences such as parent-child or teacher-student
relationships. A total of 46 topically and methodologically eligible studies (out of 164
outcome reports) were identified for inclusion in the meta-analysis on delinquency
and outcomes associated with delinquency: aggression, drug use, and academic
achievement.


OBJECTIVES 


This systematic review had the following objectives: 

a) To statistically characterize the evidence to date on the effects of mentoring
interventions (selective and indicated) for delinquency (e.g., arrest, reported
delinquency) and related problems of aggression, drug use, and school failure.

b) To attempt to clarify the variation in effects of mentoring related to program
organization and delivery, study methodology, and participant characteristics.

c) To help define mentoring more systematically than has occurred to date and,
in turn, to help clarify the intervention processes suggested as comprising
how mentoring has effects, along with other important considerations for future
research.

d) To inform policy about the value of mentoring and the key features for utility. 


SEARCH STRATEGY 


This is an update of a review completed 4 years ago. In the original review search we
benefitted from the authors of three meta-analyses on mentoring or related topics:
1) DuBois et al. (2002) on mentoring in general, 2) Lipsey and Wilson (1998) on
delinquency interventions in general, and 3) Aos et al. (2004) on interventions for
delinquency and associated social problems. These authors provided databases on
reports and coding approaches. In addition, we searched various databases including
PsycINFO, Criminal Justice Abstracts, Criminal Justice Periodicals Index, Social
Sciences Citation Index (SSCI), Science Citation Index (SCI), Applied Social Sciences
Index and Abstracts (ASSIA), MEDLINE, Science Direct, Sociological Abstracts,
Dissertation Abstracts, Database of Abstracts of Reviews of Effectiveness, and ERIC
(Education Resources Information Center), as well as the Social, Psychological,
Educational and Criminological Trials Register (SPECTR, in the original search), the
National Research Register (NRR, research in progress), and SIGLE (System for
Information on Grey Literature in Europe). Finally, the reference lists of primary
studies and reviews identified from the search of electronic resources were
scanned for any not-yet-identified studies that were relevant to the systematic
review. For this update we searched the same databases (except SPECTR, as it no
longer existed) and surveyed pertinent journals and the reference lists of primary
studies and reviews.


SELECTION CRITERIA 


1. Studies that focused on youth who were at risk for juvenile delinquency or who
were currently involved in delinquent behavior. Risk is defined as the presence
of individual or ecological characteristics that increase the probability of
delinquency in later adolescence or adulthood.

2. We included interventions focusing on prevention for those at risk (selective
interventions) and treatment (indicated interventions) that included mentoring
as the intervention or one component of the intervention and at least measured
impact of the program. We excluded studies in which the intervention was
explicitly psychotherapeutic, behavior modification, or cognitive behavioral
training and indicated provision of helping services as part of a professional role.

3. We required studies to measure at least one quantitative effect on one of the four
outcomes (delinquency, aggression, substance use, academic achievement) in a
comparison of mentoring to a control condition. Experimental and high-quality
quasi-experimental designs were included.

4. The review was limited to studies conducted within the United States or another
predominately English-speaking country, reported in English, and reported
between 1970 and 2011. We did not have resources for translating reports not
reported in English.


DATA COLLECTION AND ANALYSIS 


All eligible studies were coded using a protocol derived from three related prior
meta-analyses, with 20% double-coded. The intervention effect for each outcome
was standardized using well-established methods to calculate an effect size with 95%
confidence intervals for each of the four outcomes (if included in that study):
delinquency, aggression, drug use, and academic achievement. Meta-analyses were
then conducted for each independent study within a given outcome (delinquency,
aggression, drug use, and academic achievement). Effect sizes for each study were
scaled so that a positive effect indicated a desirable outcome (i.e., lower delinquency,
drug use, and aggression or higher academic achievement).
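
The standardization and scaling steps described above can be sketched as follows.
This is only an illustration under stated assumptions: the review says that well
established methods were used but does not name the formula, so the sketch assumes a
standardized mean difference (Cohen's d) with the common large-sample variance
approximation. The function name, the higher_is_better flag, and the example numbers
are hypothetical.

```python
import math

def standardized_mean_difference(mean_t, mean_c, sd_t, sd_c, n_t, n_c,
                                 higher_is_better=True):
    """Return (d, ci_low, ci_high) for a treatment vs. control contrast.

    Assumption: Cohen's d with a pooled SD and the usual large-sample
    variance approximation; the review does not state its exact formula.
    """
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                          / (n_t + n_c - 2))
    d = (mean_t - mean_c) / pooled_sd
    # Rescale so that a positive d always indicates a desirable outcome
    # (e.g., lower delinquency or higher academic achievement).
    if not higher_is_better:
        d = -d
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    half_width = 1.96 * math.sqrt(var_d)  # 95% confidence interval
    return d, d - half_width, d + half_width

# Hypothetical delinquency scores: fewer offenses in the mentored group.
d, lo, hi = standardized_mean_difference(
    mean_t=2.0, mean_c=3.0, sd_t=2.0, sd_c=2.0, n_t=50, n_c=50,
    higher_is_better=False)
print(f"d = {d:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Flipping the sign for outcomes where lower scores are better is what allows effects
for delinquency, aggression, and drug use to be read on the same scale as effects for
academic achievement.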


MAIN RESULTS 


A total of 164 studies were identified as meeting inclusion criteria as focused on
delinquency and mentoring. Of these, 46 met the additional criteria for inclusion in
the quantitative analyses. Twenty-seven were randomized controlled trials and 19 were
quasi-experimental studies involving non-random assignment but with matched
comparison groups, as described above. Twenty-five studies reported
delinquency outcomes, 25 reported academic achievement outcomes, 6 reported
drug use outcomes, and 7 reported aggression outcomes.

Main effect sizes were positive and statistically significant for all four outcomes.
Some studies showed effects that were not significant and a few reported negative
effects. For each outcome there was also substantial variation in effect size. Average
effects were larger for delinquency than for other outcomes. When moderation was
tested, there was considerable variation in effect sizes of studies that were similar in
regard to the presence of a given moderating influence.
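
To make the ideas of an average effect and of variation in effect size across studies
concrete, here is a minimal sketch of one common way to pool study effects and
quantify heterogeneity. It assumes a DerSimonian-Laird random-effects model with
Cochran's Q and the I-squared statistic. The review does not state which estimator
was used, so this is a stand-in, and the input values are invented for illustration.

```python
import math

def random_effects_summary(effects, variances):
    """Pool effect sizes with the DerSimonian-Laird random-effects estimator.

    Returns (pooled mean, standard error, Cochran's Q, I^2 in percent).
    Requires at least two studies.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance estimate
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return pooled, se, q, i2

# Hypothetical study-level effects (d) and their sampling variances.
pooled, se, q, i2 = random_effects_summary(
    effects=[0.10, 0.50, -0.05, 0.30],
    variances=[0.04, 0.05, 0.03, 0.06])
print(f"pooled d = {pooled:.2f} (SE {se:.2f}), Q = {q:.2f}, I^2 = {i2:.0f}%")
```

A large Q relative to its degrees of freedom (number of studies minus one), or a high
I-squared, is what a statement such as "substantial variation in effect size"
typically summarizes.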

We compared effect sizes of those studies that were random assignment
experimental designs with those that were quasi-experimental using
meta-regression and found no evidence of differences in effect sizes. We conducted
moderator analyses to determine whether effects found differed by 1) criteria for
selecting participants, 2) presence of other components along with the mentoring
intervention, 3) motivation of mentors for participation, or 4) assessment of quality
or fidelity of implementation of the intervention. We also conducted moderator
analyses to test for outcome differences by the presence or absence of four theorized
key components of mentoring interventions. The relatively limited information
about potential moderating characteristics extractable from many reports and the
limited number of reports with extractable information led us to combine effects
across all four outcomes to enable adequate power and, given our directional
expectations for moderators, to test significance using a one-tailed test (p < .05).
For these analyses, we averaged effect sizes within a given study if more
than one outcome of interest was reported. We also conducted analyses to check for
bias in effects due to type of outcome, and found no suggestion of bias.
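
The two steps described above, averaging effect sizes within a study and then testing
a directional moderator hypothesis with a one-tailed test, can be sketched as follows.
The subgroup z-test shown here is a simplified stand-in for the moderator analyses
actually run, whose exact specification the text does not give. All names and numbers
are hypothetical.

```python
import math
from statistics import NormalDist, mean

def study_level_effects(outcome_effects):
    """Average the effect sizes each study reports across its outcomes."""
    return {study: mean(values) for study, values in outcome_effects.items()}

def one_tailed_moderator_test(mean_with, se_with, mean_without, se_without):
    """z-test of the directional hypothesis mean_with > mean_without.

    Inputs are pooled subgroup means and their standard errors; returns
    (z, one-tailed p-value).
    """
    z = (mean_with - mean_without) / math.sqrt(se_with ** 2 + se_without ** 2)
    p = 1.0 - NormalDist().cdf(z)
    return z, p

# Hypothetical studies reporting one or more of the four outcomes.
per_study = study_level_effects({
    "Study A": [0.20, 0.40],       # delinquency, academic achievement
    "Study B": [0.10],             # delinquency only
    "Study C": [0.35, 0.15, 0.25]  # delinquency, aggression, drug use
})

# Hypothetical pooled means for programs with and without a moderator.
z, p = one_tailed_moderator_test(mean_with=0.35, se_with=0.08,
                                 mean_without=0.15, se_without=0.07)
print(per_study)
print(f"z = {z:.2f}, one-tailed p = {p:.3f}, significant at .05: {p < 0.05}")
```

Averaging within a study keeps each study to a single contribution, so studies that
report several outcomes do not count more than once in the combined analysis.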

We found evidence for moderation when professional development was a motive for
becoming a mentor. There was also moderation of the effect size when mentoring
programs emphasized either of two theorized components: emotional support or
advocacy. Effect sizes did not differ by whether or not the program emphasized the
other two key components (modeling/identification or teaching), nor by whether
other components were used, how risk was defined (environmental versus
individual characteristics), or whether fidelity/adherence of implementation features
was assessed.


REVIEWERS' CONCLUSIONS 


This analysis of 46 studies on four outcomes measuring delinquency or closely
related outcomes of aggression, drug use, and academic functioning suggests
mentoring for high-risk youth has a modest positive effect for delinquency and
academic functioning, with trends suggesting similar benefits for aggression and
drug use. Effect sizes varied more for delinquency and academic achievement than
for aggression and drug use. We did not find a significant difference in effect size by
study design (RA vs. QE) or by whether or not fidelity was assessed. We identified
some characteristics that moderated effects that provide additional understanding
for further studies and program design. Effects tended to be stronger when
professional development was an explicit motive for participation of the mentors. Of
four processes theorized as comprising the methods of effects in mentoring, we
found evidence for significantly larger effects when emotional support and advocacy
were emphasized. Although these findings support viewing mentoring as a useful
approach for intervention to lessen delinquency risk or involvement, limited
description of content of mentoring programs and substantial variation in what is
included as part of mentoring efforts detract from better understanding about what
might account for the benefits. The valuable features and most promising
approaches cannot be ascertained with any certainty. In fact, the body of studies is
remarkably lacking in description of key features, program design organization, and
theorized processes of impact that are typically provided in empirical reports of
intervention effects. Our judgment is also that there does not seem to be much
progression in quality of details in reports over the time period studied here. Given
the popularity of this approach, the promise of benefits should be seen as a strong
argument for a concerted effort through quality randomized trials to specify the
theoretical and practical components for effective mentoring with high-risk youth.
Concordantly, lacking such features, further trials may not add useful knowledge.


1 Background


Mentoring is one of the most commonly used interventions to prevent, divert, and
remediate youth engaged in, or thought to be at risk for, delinquent behavior, school
failure, aggression, or other antisocial behavior (DuBois, Holloway, Valentine, &
Cooper, 2002). It is the centerpiece of the work of the Boys and Girls Clubs of
America. A recent account lists over 5000 organizations within the United States
that use mentoring to promote youth wellbeing and reduce risk (MENTOR/National
Mentoring Partnership, 2006). In fiscal year 2011 it is estimated that approximately
$100 million in federal support and research funds was dedicated to youth
mentoring (DuBois et al., 2011).

Definitions of mentoring vary, but there are common elements that can be identified
across definitions (DuBois & Karcher, 2005; DuBois et al., 2011). Most commonly
the central feature is a one-on-one relationship between a provider (mentor) and a
recipient (mentee) for the potential of benefit for the mentee. For the purpose of
this review, mentoring will be defined by the following 4 characteristics: 1)
interaction between two individuals over an extended period of time, 2) inequality of
experience, knowledge, or power between the mentor and mentee (recipient), with
the mentor possessing the greater share, 3) the mentee is in a position to imitate and
benefit from the knowledge, skill, ability, or experience of the mentor, 4) absence of
the role inequality between provider and recipient that typifies most helping or
intervention relationships, whether based in professional training or certification of
the provider or inherent in parent-child, teacher-student, or other
professional-client relationships. Thus, mentoring differs from professional-client
relationships such as counseling or therapy, and from parenting or formal
educational relationships.

When applied to delinquency and other similar outcomes, mentoring usually
involves older, usually adult, persons in the community who provide opportunities
for imitation, gaining advice, pleasurable recreational activities that show care and
interest in the mentee, and emotional support, information, and advocacy through a
one-to-one relationship. Such opportunities are thought to foster healthy
development and diversion from risk-elevating activities and attitudes (Jekielek,
Moore, Hair, & Scarupa, 2002; Rhodes, Spencer, Keller, Liang, & Noam, 2006).

Over the past twenty years, mentoring has drawn substantial interest from
policymakers, intervention theorists, and those interested in identifying promising
and useful evidence-based approaches to interventions for criminal justice and child
welfare outcomes (Grossman & Tierney, 1998; Jekielek et al., 2002). This has
included a substantial investment in the United States and elsewhere in support for
implementation of mentoring, a professional organization dedicated to advancing
quality of and use of mentoring, and a proliferation of mentoring programs
(MENTOR/National Mentoring Partnership, 2006). The popularity of and extensive
anecdotal praise for mentoring make it important to have a sound, evidence-based
understanding of its promise. While prior meta-analyses and moderation tests of
specific interventions can point to some potentially important features, none of
these analyses have focused on mentoring as an intervention for youth at risk for
delinquency. In this study, we conduct a meta-analytic review of mentoring
interventions that have been evaluated for their effects on delinquency (e.g., arrest
or conviction as a delinquent, self-reported involvement) and three outcomes
(aggression, drug use, academic achievement) that often co-occur with delinquency,
share risk factors, are often also targets of delinquency interventions, and show
effects from such efforts (Tolan, 2002).

Unlike many types of intervention, there are a substantial number of studies that
evaluate the effects of some form of youth mentoring (DuBois et al., 2011; Rhodes,
Bogat, Roffman, Edelman, & Galasso, 2002). Critical reviews have focused on the
potential benefits of mentoring and characteristics that might be associated with
positive effects from mentoring (Hall, 2003; Rhodes, 2002). More recently, several
meta-analyses have considered mentoring programs in relation to youth risk,
including delinquency (Aos, Lieb, Mayfield, Miller, & Pennucci, 2004; DuBois et al.,
2002; DuBois et al., 2011; Lipsey & Wilson, 1998). Thus, unlike some areas of
intervention for delinquency and related problems, the accumulated literature on
mentoring is substantial and has had conceptual and statistical scrutiny. None of
the meta-analyses to date correspond exactly with the focus of the present review,
but they were very helpful in planning this review. They suggest standards against
which to evaluate the completeness of study inclusion and choices about coding and
methodological requirements.

Many of the conceptual reviews have been focused on the potential of mentoring as a
general approach to youth development promotion and to reduce risk among
high-risk populations (Jekielek et al., 2002; Rhodes, 2002). The meta-analysis by DuBois
et al. (2002) focused on mentoring efforts related to youth development. Although
there was differentiation of "problem behavior" from other outcomes (e.g.,
educational attainment, vocational), there was not clear emphasis on delinquency
indicators as a separate area. The follow-up/updating to that review in 2011 (DuBois
et al., 2011) utilized a similar general outcome category (conduct problems).

Notably, DuBois et al. (2011) report that effect sizes were larger for programs serving
youth involved in problem behavior than for those with other bases for inclusion
and also for those with higher levels of environmental or individual risk. Lipsey and
Wilson (1998) organized their review around an interest in serious juvenile
offenders. Therefore, inclusion was not about delinquency risk in general,
precursors such as aggression level, or related outcomes such as substance use or
academic functioning. Also, the interventions were coded in such a way that
interventions that included mentoring in an array of interventions could not be
distinguished from those that focused primarily or exclusively on mentoring.
Mentoring was denoted by its mention in the description of a study, but often was
considered as one member of a class of interventions with similar features. Aos et al.
(2004) undertook their meta-analysis to inform a state legislature about the
potential impact, costs, and benefits of many empirically tested interventions for
delinquency and other youth problems such as early pregnancy. Thus, their
emphasis was on specific programs rather than mentoring as a general approach.
Moreover, that review only examined the relative effect sizes in relation to costs and
potential cost savings rather than the usual focus on methodological issues and other
moderators of effects. In addition, they were interested in programs with a high
level of empirical support for effects, so that their inclusion criteria were more
restrictive than those used here.

The aforementioned conceptual and statistical reviews provided excellent 
perspectives on mentoring evaluation and valuable benchmarks for guiding this 
review. In addition, they provided strong data bases from which to organize this 
review. Because of the generous sharing of information about content and methods 
by these reviewers (including access to their databases in some cases), this review 
was able to build efficiently from their prior efforts in determining coding. These 
prior reviews also helped reduce worry about file drawer and grey material that 
might be important to consider but not found without thorough searching. Of 
course, we conducted an independent search to verify the applicable literature, 
published or not. 

The accumulated reviews and the variations in the studies they included also point
to the value of this review. Each suggested mentoring programs could have
important effects on delinquency and related outcomes. In the DuBois et al. review
(2002), the overall category of problem behavior, which includes delinquency, had
the largest effect sizes of any outcome category. This was confirmed in the updating
of that review in the sense that average effect was equal to or close to each of the
other categories considered (mean d = .21; DuBois et al., 2011). However, that
review focused on mentoring in general and effects irrespective of sample selection
basis. That review and others noted the variation in effects even among
well-designed and completed studies, variation that undercut confidence in the mean
effect findings. In fact, the field is marked by mixed results (significant positive and
negative effects) among the methodologically stronger studies (e.g., McCord, 1992;
O'Donnell, Lydgate, & Fo, 1979). As the DuBois et al. (2002, 2011) reviews excluded
the McCord Cambridge-Somerville study and another major mentoring study, the
Diversion Project of Davidson and colleagues (Davidson & Redner, 1988), the
implications for mentoring as a criminal justice intervention are not clear. Both are
studies likely to carry substantial impact on overall effect estimates and on
design impact or moderator interpretation. Also, a scan of the literature at the
outset of the review showed several new pertinent studies since the prior reviews.
The present study focuses on mentoring for youth at risk for delinquency,
a more specific population focus than prior reviews of mentoring. Finally, with this
update, this review includes studies not considered by the other mentoring
reviews.

Understanding Mentoring Effects 

While a relatively large number of studies with some minimal evaluation design
features have been found and utilized in prior evaluations of mentoring, there are
characteristics of this field that have limited how informative reviews and
meta-analyses have been. For example, the most often considered intervention features
are the extent of matching on similarity of demographic characteristics and interests
of the mentor and recipient and whether the personal relationship is free of
dissatisfaction (referred to as mentor relationship quality; see Rhodes,
2005; DuBois et al., 2011). Yet, these are merely post hoc identified variables
differentiating effect sizes. They do not directly or indirectly indicate the
processes through which mentoring has its effects or suggest what it is about
mentoring that might make it different from many other helping relationships
(MENTOR/National Mentoring Partnership, 2009). A major limitation of the field,
and perhaps of progress in understanding more about how and why mentoring
shows positive effects, is the lack of specificity in describing the activities comprising
a given mentoring intervention and, perhaps more importantly, in tying activities and
practices to theorized key processes through which positive impact is thought to
occur. This limits the ability to compare program features that may explain variations
in effects as well as the ability to tie programs to theories about interpersonal
processes that could explain how mentoring has impact (DuBois et al., 2011; Rhodes,
Grossman, & Resch, 2000). Among the reports and reviews there is considerable
variation in what activities are considered mentoring essentials and which are
optional (DuBois & Karcher, 2005; MENTOR/National Mentoring Partnership,
2009). At the same time there is not much attention to and little certainty about
what constitutes a mentoring intervention and might distinguish mentoring from
other helping relationships (Rhodes et al., 2000). In addition, more understanding
of key processes of a given program could improve the ability to compare across
programs and cumulatively point toward the important or necessary components
that define a program as mentoring (Roberts, Liabo, Lucas, DuBois, & Sheldon,
2004).

Limited intervention description may be because mentoring arose as a voluntary and
"indigenous" approach to youth intervention. Thus, many mentoring efforts arose
within a given setting without intention to formalize and standardize performance
and activities. The practitioners who developed their particular approach may have
had less training in and interest in formal evaluation features. As a field of
intervention services and as a research focus, there appears to be mixed interest in
facilitating more formal operations that will yield more informative and comparable
results (MENTOR/National Mentoring Partnership, 2006). Also, because one
common basis for mentoring is a view that the positive influence of an interested
person providing a supportive relationship is what is helping, there may well be less
interest in trying to specify what activities and processes constitute mentoring and
what among these could explain any benefits derived. For all of these reasons
formalized protocols and systematic training approaches may not have been a
priority. Consequently the body of research is remarkable in the limited emphasis
on systematic description of intervention content, description of intended processes
through which effects are expected, and important features of implementation
and providers. There seems to remain limited valuing of and perhaps even some
reluctance to aim for continuity across the field or specificity in applying and
describing mentoring efforts that might facilitate scientific understanding of effects.
Hence, there are few training, implementation, and dosage parameters that can be
identified as having consensus. There are few indications of what is considered
essential or critical for mentoring and helpful in distinguishing mentoring from
other helping relationships and approaches. Similarly, the reports reviewed here
continue an unfortunate tradition of having limited information by intervention
science standards and are less informative than needed about what may account for
benefits accrued. Overall, greater interest in relating to a common set of principles,
theorized processes, or requisite structures and components would seem an advance
that could serve the field well. Thus, one of the goals of this effort, and one that is a
different emphasis than prior meta-analyses, was to code, to the extent possible,
comprising activities, mentor selection/motivation, and training or implementation
features as viewed from an intervention research lens. A second was to theorize four
processes that are often mentioned or pointed to as how mentoring affects youth
positively and that in total can distinguish mentoring from other forms of helping
interventions. Of interest was that this combination does not, as a group, also form
the critical processes of other helping interventions such as teaching or psychotherapy.
Emphases on these two interests (intervention features and key processes) could
potentially help advance understanding of mentoring, effects found, and potential
for further study and use.

Important Intervention Features Affecting Mentoring Impact 

One aspect of mentoring intervention characteristics given substantial attention is
the implication that a strong personal relationship between the mentor and mentee
is a key to any benefits derived (DuBois et al., 2002; Rhodes et al., 2006). Thus one
advance in the field is to assess how positive and engaging the relationship is
between the mentor and mentee (Rhodes et al., 2006). For example, DuBois et al.
(2011) report larger effect sizes when matching of mentor and youth was based on
shared interests; presumably this improves the likelihood of a good relationship. In
most cases, a corollary is that the mentor is undertaking this activity not as a
professional in the helping or social service professions, but because of personal
interest or sense of duty, often as a volunteer (Rhodes, 2002). When a person with
professional background or duties to provide such services offers mentoring, the
emphasis is more on the relationship and the personal interest in the mentee than
on specific skills, activities, or formal protocols. Thus, it has been noted that one
limitation of mentoring may be that providers may be less accountable as they are
volunteers and/or may not be well prepared for challenges of developing and
maintaining a relationship with sometimes challenging and less appreciative youth
(Grossman & Tierney, 1998). In contrast, motivation that is not
personal, that is, for professional advancement or as a paid position, might be
expected to lessen the personal commitment and connection thought to spark
effective mentoring. More understanding of how different reasons for undertaking
mentoring influence effects would help with understanding effect variations and
provide direction for improving impact. We test for differences by motivation of
mentor for engaging in this work.


A second area of some importance in understanding how to direct mentoring efforts
is the effect of structuring of the effort and expecting fidelity to an approach. While
it is increasingly recognized that training in skills and expectations is important for
mentoring, there is much less clarity about what is important to expect. Mentoring
has been characterized as growing out of a mentor's commitment to youth (Rhodes,
2002), with the accompanying implication that structuring the activities and
processes to be ensured would detract from the individualistic authentic
engagement that carries the benefits. In contrast, research on other forms of
intervention has not supported such a view, pointing to clearer expectations
and fidelity prescriptions as promoting larger effects (Tolan & Gorman-Smith,
2003). Thus, the extent to which there is emphasis on following these procedures
and principles thought to be helpful should relate to effect levels. Therefore in this
review we examine if assessment of fidelity relates to effect size.

A third question of importance about mentoring is the relative value of mentoring as
a high-risk selective and/or indicated approach rather than as a universal
intervention (Tolan & Guerra, 1994). Mentoring studies have been applied to high-
risk, identified, and general populations of youth. There is some indication the
effects might be greater for higher risk youth, although the results are not fully
consistent (DuBois et al., 2011). Some have argued that mentoring represents an
alternative view about youth risk, a focus on promoting healthy or positive
development through strengthening abilities rather than minimizing exposure to
risk or remediation of undesirable behavior and characteristics (Jekielek et al.,
2002). Also, it may be that mentoring that is developed for and applied to high-risk
youth has impact for that population that programs for non-delinquent or general
population youth do not. There is evidence that preventive effects for high-risk
youth may be quite different from those accrued for the general population (Tolan &
Gorman-Smith, 2003). For example, it may be that mentoring is not valuable in
affecting delinquency or related outcomes of high-risk youth because it is not
structured enough and focused on multiple risk factors thought to drive that
behavior (Lipsey & Wilson, 1998). Thus, there is a policy interest in whether
targeting high-risk youth (selective inclusion) is useful. Therefore, the review
undertaken here was focused on youth defined as high-risk for or already engaged in
delinquency (Tolan & Gorman-Smith, 2003).

Similarly, as others have noted, it is common for mentoring to occur as part of a
multi-component program, whether as one of several components or as a central
focus augmented by additional supporting activities (Aos et al., 2004). This leaves
open an important question of the extent to which effects attributed to mentoring
might actually be due to its coincidental inclusion with other effective components.
It also leaves undifferentiated to what extent it matters whether mentoring is
delivered simply as one of a set of substantial program features or whether the
program is primarily mentoring with some augmentation to help support and
enhance the mentoring impact. These questions of interest suggest that coding of
these features, where discernible, might improve understanding of the value of
mentoring.


Identifying Potential Key Processes Defining and Differentiating 
Mentoring 

In addition to these features of intervention organization that have been of interest
in characterizing mentoring as a field of intervention and in furthering the
evaluation knowledge about mentoring, there is an important but almost unattended
issue of what processes are typical of and constitute mentoring as an intervention.
Are there activities or underlying purposes of activities that are common to
mentoring or that might vary across mentoring programs and in doing so help
account for differences in effects? As noted above, theoretical summaries of the field
and attempts to relate mentoring to prevention science, developmental
psychopathology, and/or youth development literature in general have suggested
some likely key features of mentoring (Lipsey & Wilson, 1998; McCord, 1992; Tolan
& Guerra, 1994). These processes are differentiated from the attention to the
mentor-mentee relationship that has dominated evaluation of mentoring (Rhodes,
2005; DuBois & Karcher, 2005). The latter represents an aspect of connection that,
while important as a basis for mentoring, is a common basis for any influence
relationship.

Through systematic review of theoretical organization of process models of
mentoring (e.g., MENTOR/National Mentoring Partnership, 2009; Rhodes, 2005),
indices utilized by DuBois et al. (2002) in examining how effect size of mentoring
related to score on a best practices index, components described in programs with
significant effects (e.g., Davidson & Redner, 1988), and qualitative analyses of
mentoring relationships (e.g., Deutsch & Spencer, 2009), we organized a set of
processes that seemed to occur across mentoring programs, whether explicitly
described or implicit in the activities utilized. In addition, we compared mentoring
to other helping interventions to identify distinguishing features. For example,
mentoring is distinguished from psychotherapy by the non-professional relationship
and the lack of emphasis on alleviating mental health problems. From these multiple
bases we identified four processes as central to mentoring: 1) identification of the
recipient with the mentor that helps with motivation, behavior, and bonding to
conventions; 2) provision of information or teaching that might aid the recipient in
managing social, educational, legal, family, and peer challenges; 3) advocacy for the
recipient in various systems and settings; and 4) emotional support and friendliness
to promote self-efficacy, confidence, and sense of mattering (DuBois et al., 2002;
DuBois et al., 2011; Rhodes et al., 2002). These processes are frequently mentioned
individually as potential bases for mentoring benefits. More recently some attention
has been given to how advocacy within mentoring can affect impact. DuBois et al.
(2011) report that when advocacy was considered a mentor function, effect sizes
averaged .07 standard deviation units larger than when it was not. Also, several of
the more fully described efforts point to one or more of these processes as intended
elements of the mentoring. Understanding of whether emphasis on such processes
relates to effects is one intended contribution of this review. Therefore, we coded
studies for evidence of each key process driving or comprising the intervention to
permit examining how their inclusion may have affected outcome.


Prior Evaluation of Features Affecting Mentoring Impact 

DuBois et al. (2002, 2011) recognized many of the issues related to advancing and
deepening understanding of mentoring effects and incorporated coding of several of
these features into their meta-analyses. DuBois et al. (2002) constructed an
index of what could be considered best practices in youth mentoring based on
recommendations of prior reviews and recommendations for establishing effective
mentoring programs, such as those of the National Mentoring Working Group (1991),
and coded each intervention report to the extent possible from source data (DuBois
et al., 2002). They included 11 program features to mark how methodic inclusion in
the program was, whether mentors and mentees were matched on demographic
characteristics, how structured or prescribed activities were, and the frequency or
extent of contact. These codes were then amalgamated into an overall index of
extent of desired features. While this represents an informative advance about how
the extent of features considered useful for good mentoring related to effect size,
because it is a single score across many areas it cannot indicate the importance of
specific features. Also, it may have obscured how many of the reports did not have
adequate reporting to fully assess the 11 features.

We attempt to build on efforts of DuBois et al. (2002) to code theoretically and
empirically linked valued characteristics, activities, and organization by focusing on
the moderating effects of each of several key features related to 1) selectivity in
inclusion (high risk versus universal or no selectivity within the population); 2)
explicit attention to presence of four key processes (modeling, emotional
support, advocacy, and teaching); 3) whether or not mentoring is a stand-alone
approach in that study or was undertaken along with some other components; 4) the
motivation of the mentors in participating; and 5) the extent to which quality of
work and fidelity were assessed or emphasized. This coding was considered useful
for suggesting what might differentiate mentoring from other similarly intended
youth interventions. Despite prior identification of specificity of such features as a
major limitation of the mentoring literature (Tolan & Guerra, 1996), we did not find
much improvement over time in the ability to determine details needed to code
many of these features for this review. We had to limit our analyses to those features
that could be coded for enough studies to enable some useful comparison.


2 Objectives of the Review


This updating of a prior systematic review had the following objectives: 

1. To statistically characterize the evidence to date on the effects of mentoring
interventions (preventive and treatment) for delinquency (e.g., official records
and self-reported), and the associated problems of aggression, drug use, and
school failure.

2. To examine the heterogeneity of effects for each outcome and the role of design
(RA vs. QE) in the effects found.

3. To examine the relation of a few key aspects of mentoring interventions (e.g.,
selection vs. universal inclusion, mentor motivation, quality and fidelity control,
presence of important features of mentoring, and presence of other
interventions) to effects found.

4. To suggest important features of existing literature to be further developed and 
supported to improve how informative evaluations can be and to increase 
comparability across mentoring efforts. 

5. To identify gaps in this research area and make recommendations for further 
research. 

6. To inform policy about the value of mentoring and the key features for utility.


3 Methods


In order to provide a review that is as free of bias as possible, we adopted a
systematic review strategy for the research on the effects of mentoring interventions,
as guided by Campbell Collaboration standards and employed in the original review.
This report is an update of the prior Campbell Review that covered reports between
1970 and 2005 (www.campbellcollaboration.org/lib/download/238/). It extends
the review to reports available through July of 2011.


Search strategy for identifying relevant studies 

Three authors have conducted prior meta-analyses on mentoring or related topics: 1)
DuBois et al. (2002) on mentoring in general, 2) Lipsey and Wilson (1998) on
delinquency interventions in general, and 3) Aos et al. (2004) on interventions for
delinquency and associated social problems. Prior to conducting this review, each of
these authors allowed us access to some of the materials used in their analyses. Drs.
Lipsey and Aos and their colleagues released the actual databases used for their
analysis. We found that one or more of these authors had already located many of
the studies to be included in this analysis. However, we conducted our own review
to locate studies done since these earlier reviews were completed and to locate other
studies, including those that were unpublished at the time of these previous
analyses. During the search phase, abstracts were reviewed and studies that did not
include the target outcomes or were clearly not of experimental/quasi-experimental
design were excluded from further consideration. Full-text copies of the remaining
164 studies were then obtained. We used dates, sample sizes, authorship, and
information provided on studies to determine whether two effects on the same
outcome came from the same study. We did not count effect sizes at different
follow-up points as independent effects, using the effect closest to post-test for
these analyses.

Search terms and databases 

We based our search terms on those used by prior meta-analyses. We used a
combination of terms in searching electronic databases and research registers. Table
1 shows the search terms used, although slight deviations in key words (including
derivative forms of the listed terms) required modification to achieve equivalent
searches in some databases (e.g., choosing a broader search term when a narrower
term was not supported in the database). We also provide details of combinations of
the search terms and some examples of resulting search combinations (shown in the
inner cells) in Table 2. We searched the databases using combinations of terms, each
of which contained: 1) one of four outcomes (and derivative forms of these terms):
delinquency, aggression, substance use, or academic achievement; 2) a cognate of
mentoring; and 3) a cognate of intervention.
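
As an illustration of how such three-part combinations can be generated, the following is a minimal sketch. The four outcome terms are taken from the text; the mentoring and intervention cognates, the `build_queries` helper, and the AND syntax are assumptions made for the example and are not the terms or syntax listed in Table 1 or Table 2.

```python
from itertools import product

# Outcome terms named in the text.
OUTCOMES = ["delinquency", "aggression", "substance use", "academic achievement"]
# Hypothetical cognates, used only for illustration (not the Table 1 terms).
MENTORING_TERMS = ["mentor", "mentoring"]
INTERVENTION_TERMS = ["intervention", "program"]


def build_queries(outcomes, mentoring_terms, intervention_terms):
    """Return one AND-combined query per (outcome, mentoring, intervention) triple."""
    return [
        f'"{outcome}" AND "{mentoring}" AND "{intervention}"'
        for outcome, mentoring, intervention in product(
            outcomes, mentoring_terms, intervention_terms
        )
    ]


queries = build_queries(OUTCOMES, MENTORING_TERMS, INTERVENTION_TERMS)
for query in queries[:3]:
    print(query)
```

Each database would still need its own adjustment of operators and truncation, as the text notes for databases that did not support a narrower term.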

Databases searched 

Databases were selected based on their potential relevance to the topic and to the
outcomes of delinquency, academic achievement, aggression, and substance use
more generally. The databases searched included PsycINFO, Criminal Justice
Abstracts, Criminal Justice Periodicals Index, Social Sciences Citation Index (SSCI),
Science Citation Index (SCI), Applied Social Sciences Indexes and Abstracts
(ASSIA), MEDLINE, Science Direct, Sociological Abstracts, Dissertation Abstracts,
Database of Abstracts of Reviews of Effectiveness, and ERIC (Education Resources
Information Center). The following research registers were also searched: the
Social, Psychological, Educational and Criminological Trials Register (SPECTR; used
for the original search, not in the update), the National Research Register (NRR,
research in progress), and SIGLE (System for Information on Grey Literature in
Europe). Finally, the reference lists of primary studies and reviews in studies
identified from the search of electronic resources were scanned for any not yet
identified that were relevant to the systematic review. All searches covered the
period until July 2011.


3.1 CRITERIA FOR INCLUSION AND EXCLUSION OF 
STUDIES IN THE REVIEW 


Only studies that satisfy all of the following inclusion criteria and none of the 
following exclusion criteria were included in this review: 

Outcomes measured 

We focused this systematic review on outcomes related to juvenile delinquency. We
included studies with outcome measures of juvenile delinquency, reported by the
individual or by others, or derived from archival sources such as arrest or juvenile
court records. We also included studies focusing on precursors of delinquency such
as aggression or high levels of externalizing problems and studies with two outcomes
that are correlated with and frequently co-occur with criminal involvement or
delinquency risk (drug abuse and academic achievement/school failure). As noted
above, the specific terms for each outcome are provided in Table 1.


Types of participants 

Juvenile delinquency is typically defined as antisocial or criminal behavior by
persons under age 18 (Tolan, 2002). In this systematic review of mentoring
interventions, we included studies that involved youth who were included because
they were currently showing behavior that would constitute juvenile delinquency or
were identified and included because they were "at-risk" for juvenile delinquency.
At-risk is defined as the presence of individual or ecological characteristics that
increase the probability of delinquency in later adolescence or adulthood (Tolan,
2000). Ecological characteristics include family and parenting influences on
behavior, residence in neighborhoods with high levels of poverty or crime, exposure
to gangs, and other social setting factors (Tolan & Gorman-Smith, 2003). Individual
characteristics include high scores on screening measures for aggression, evidence of
oppositional defiant or conduct disorders, school failure, or attitudes and beliefs
consistent with elevated use of aggression or antisocial behavior (Farrington, 2004).
Demographic characteristics were not considered as designating at-risk for
consideration for inclusion here. Thus, a study that targeted a demographic group,
even if doing so because they are considered at risk, was not included unless selection
met our criteria otherwise.

Intervention Type 

We included interventions focusing on prevention and treatment (referred to as
selective and indicated population interventions). In the initial phase of study
selection, we sought out any studies that described their interventions as mentoring, 
that mentioned mentoring as any part of their intervention strategy, or had 
interventions characterized by any of the four characteristics noted above, whether 
or not they specifically mentioned mentoring. 

Regarding the defining characteristic of absence of formalized role inequality, 
previous reviews have differed on the inclusion of studies using professionals as 
mentors. DuBois et al. (2002) excluded interventions using professional providers, 
with the exception that some studies that employed mental health professionals as 
mentors were included under certain conditions (see DuBois et al., 2002; Rhodes,
2002 for those criteria). This appears to also have been the approach used in the
updated meta-analysis by DuBois et al. (2011). We differed from these prior reviews
by including studies with mental health providers as mentors if their involvement
was unstructured or limited to a non-specific or support intervention (not
psychotherapeutic). Functionally this means inclusion here of some critical studies
for the current focus that were not included in the DuBois review, such as the
McCord Cambridge-Somerville study (McCord, 1978, 1979).

We then excluded studies in which the intervention was explicitly 
psychotherapeutic, behavior modification, or cognitive behavioral training. Although 
we included studies in which mentoring was done as a part of another structured 
intervention, those studies that were conducted without providing results for the 
mentoring intervention separately were coded as including either an additional 
primary intervention (i.e., a major component in addition to mentoring) or an 
additional secondary intervention (i.e., a minor component in addition to 
mentoring).

In addition to requiring that studies investigate the effects of a mentoring
intervention, as described above, we followed three additional criteria based on
those used by Lipsey and Wilson (1998) in their meta-analysis of intervention effects
on delinquency. We only included studies that measured at least one quantified
outcome variable for the outcome of interest among the four considered here and
that provided sufficient data to allow calculation of an effect size and to decipher its
direction. When studies measured a delinquency-related outcome but did not report
sufficient detail to allow calculation of an effect size, we attempted to contact the
author to obtain additional information. Because of access to the Aos and Lipsey
databases we had a relatively complete rendering of the studies from which such
information could be extracted. There were, therefore, very few studies for which we
were uncertain about whether additional information was obtainable.

Research Design 

The second criterion for inclusion in this review was that the study design involves a 
comparison that contrasted an intervention condition involving mentoring with a 
control condition. Control conditions could be "no treatment," "waiting list," 
"treatment as usual," or "placebo treatment". To ensure comparability across 
studies we made an a priori rule to not include comparisons to another experimental 
or actively applied intervention beyond treatment as usual. However, there were no 
such cases among the studies otherwise meeting criteria for inclusion. 

We coded studies according to whether they were experimental or quasi-
experimental designs. To qualify as experimental or quasi-experimental for the
purposes of this review, we required each study to meet at least one of three criteria:
1) random assignment of subjects to treatment and control conditions or assignment
by a procedure plausibly equivalent to randomization; 2) prospective matching of
individual subjects in the treatment and control conditions on pretest variables
and/or other relevant personal and demographic characteristics; 3) use of a
comparison group with demonstrated retrospective pretest equivalence on the
outcome variables and demographic characteristics, as described below.

Randomized controlled trials that met the above conditions were clearly eligible for
inclusion in the review. At the other end of inclusion eligibility, single-group pretest-
posttest designs (studies in which the effects of treatment are examined by
comparing measures taken before treatment with measures taken after treatment on
a single subject sample) were never eligible. A few nonequivalent comparison group
designs (studies in which treatment and control groups were compared even though
the research subjects were not randomly assigned to those groups) were included.
Such studies were only included if they matched treatment and control groups prior
to treatment on at least one recognized risk variable for delinquency and had pretest
measures for outcomes on which the treatment and control groups were compared
and found to be essentially equivalent. We required that non-randomized quasi-
experimental studies employed pre-treatment measures of delinquent, criminal, or
antisocial behavior, or significant risk factors for such behavior, that were reported
in a form that permitted assessment of the initial equivalence of the treatment and
control groups on those variables.

Time Period and English Language Criteria

We limited the review to those studies conducted within the United States or
another predominantly English-speaking country and reported in English. This was
because we did not have resources for translation of studies not published in English
and the vast majority of programs were conducted in the United States. Juvenile
subjects did not need to speak English. A study conducted in the United States or
Canada with resident Hispanic youth, for example, could have been included.

We limited the review to studies published since 1970. The time frame between
1970 and the present (time of completion of search to conduct coding, 2011) is
consistent with the start of the time interval used by the review of the literature on
delinquency conducted by Lipsey and Wilson (1998) and others. This also is the
time period for almost all the available studies with the necessary information
and design features to be included in this review.

Coding of Article Characteristics 

We double-coded 20% of the new articles (N=32), and calculated inter-coder
reliability coefficients for study type (e.g., randomized trial), study quality,
participant selection criteria (e.g., individual or behavioral risk), mentor motivations
(e.g., survivor of abuse, professional development), and intervention components
(e.g., modeling, teaching) using Cohen's kappa. We found high reliabilities for study
type (κ = 1.0), study quality (κ = .93), and selection criteria (κ = .81). Coders easily
determined some mentor motivations such as personal experience that connected to
the youth needs (e.g., experienced abuse) (κ = .90), but were less certain with topics
such as civic duty or professional development (κ = .68). Not all categories were
coded in the random sample of studies that were double coded. For example, of the
mentoring components (modeling/identification, teaching, and emotional support)
only modeling was found in the studies randomly selected for double coding. Final
kappa reliabilities all were above .6, a level Landis and Koch (1977) suggested
represented substantial agreement. Coders sought consensus with their supervisors,
particularly on difficult-to-code categories such as mentor motivations. If this could
not resolve differences then author Schoeny made a decision about categorical
coding.
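
For readers unfamiliar with the statistic, Cohen's kappa compares observed agreement between two coders with the agreement expected by chance from each coder's marginal frequencies. The following is a minimal sketch of that calculation; the function name and the example codes are hypothetical and are not data from this review.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' categorical codes on the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the two coders gave the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from the product of each coder's marginal proportions.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical study-type codes from two coders (illustration only).
coder_1 = ["RCT", "RCT", "QE", "QE", "RCT", "QE", "RCT", "QE"]
coder_2 = ["RCT", "RCT", "QE", "RCT", "RCT", "QE", "RCT", "QE"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))
```

In this made-up example the coders agree on 7 of 8 items and chance agreement is .50, so kappa is .75.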

Effect sizes for outcomes were also double-coded for 20% of the new articles. There
were no substantial variations in these (r = .99) with only one disagreement. As
with other coding decisions we first attempted to resolve bases for differences (e.g.,
technical inconsistency that if corrected removed difference). We had a protocol in
place to then structure discussion of differences to attempt to reach consensus. If
necessary a decision would then be made by either the first author (Tolan) or author
Henry. Given the level of agreement we did not have to proceed past the technical
comparison and a brief discussion to reach consensus.

We conducted a separate meta-analysis for each outcome (delinquency, aggression,
drug use, academic achievement). Each grouping of studies was based on the
outcome, such that some studies might be included in more than one meta-analysis
due to measuring more than one outcome. Thirteen studies reported more than one
outcome, four of which had three outcomes. A single outcome measure was used for
each study for a given outcome category. No studies reported multiple measures of a
single outcome (e.g., multiple measures of delinquency or aggression).

Statistical Procedures.

Effect Size Calculations: For this study we used inverse-variance meta-analysis with
a random-effects model, performed and plotted through the metagen function of the
meta package in the R statistical language. The random-effects model addresses the
research question of whether the average effects of an intervention in the population
are significantly different from zero (Bailey, 1987; Raudenbush, 1994).

The inverse variance method, as its name suggests, weights individual studies by the
inverse of the variance of their effect size. Thus, this method requires the calculation
of standard errors of the effect sizes. For this purpose, we estimated variances for
each effect size according to Hedges and Olkin's (1985, p. 86) Formula 14:

$$\hat{\sigma}^2_{d_i} = \frac{n_e + n_c}{n_e \, n_c} + \frac{d_i^2}{2\,(n_e + n_c)}$$

where $\hat{\sigma}^2_{d_i}$ is the estimated variance of the effect size, $n_e$ is the number of
experimental subjects, $n_c$ is the number of control subjects, and $d_i^2$ is the square of
the effect size of the study.
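
To make the weighting concrete, the following is a minimal sketch of the variance formula and of inverse-variance random-effects pooling. It is not the code used for this review, which relied on R. The DerSimonian-Laird estimator of between-study variance is assumed here for simplicity, and the function names and example numbers are hypothetical.

```python
import math


def smd_variance(d, n_e, n_c):
    """Variance of a standardized mean difference, per the formula above."""
    return (n_e + n_c) / (n_e * n_c) + d ** 2 / (2 * (n_e + n_c))


def random_effects_pool(effects, variances):
    """Inverse-variance random-effects mean.

    Assumes the DerSimonian-Laird estimate of tau^2 (an illustrative choice).
    Returns (pooled mean, standard error, tau^2).
    """
    w = [1.0 / v for v in variances]
    fixed = sum(wi * d for wi, d in zip(w, effects)) / sum(w)
    # Cochran's Q measures dispersion of effects around the fixed-effect mean.
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * d for wi, d in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return mean, se, tau2


# Hypothetical studies: (effect size, n experimental, n control).
studies = [(0.10, 40, 40), (0.35, 60, 55), (0.20, 120, 118)]
effects = [d for d, _, _ in studies]
variances = [smd_variance(d, n_e, n_c) for d, n_e, n_c in studies]
pooled, se, tau2 = random_effects_pool(effects, variances)
print(pooled, se, tau2)
```

Larger studies have smaller variances and therefore receive more weight; any between-study variance reduces the differences among the weights.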


The standardized mean difference effect sizes of the interventions under evaluation
were calculated in units of Hedges' (1981) g. For studies reporting means, standard
deviations, and Ns of numeric data, the effect size was calculated by dividing the
treatment difference less the control difference by the pooled treatment and
control standard deviation:

$$SMD = \frac{(M_{E2} - M_{E1}) - (M_{C2} - M_{C1})}{\sqrt{\dfrac{(N_{E2} - 1)\,S_{E1}^2 + (N_{C2} - 1)\,S_{C1}^2}{N_{E1} + N_{C1} - 2}}}$$

where: M = mean, S = standard deviation, N = sample size,
E = treatment, C = control,
1 = pretest, 2 = posttest

For studies that reported dichotomous outcomes, we calculated odds ratios and
converted them into an equivalent standardized mean difference effect size estimate
(Lipsey & Wilson, 1998). Chinn (2000) noted that dividing the natural log of an
odds ratio by $\pi/\sqrt{3}$ produces an excellent approximation of the standardized mean
difference effect size.

We also applied a correction to all effect sizes that compensates for small sample
bias:

$$g = d \left(1 - \frac{3}{4\,(n_1 + n_2) - 9}\right)$$
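
The three calculations described in this section (the pretest-posttest standardized mean difference, the odds ratio conversion, and the small-sample correction) can be sketched as follows. This is an illustrative sketch only: the function names and the example values are hypothetical, and the pooled standard deviation is computed from a single set of group sizes for simplicity.

```python
import math


def pooled_sd(sd_e, n_e, sd_c, n_c):
    """Pooled standard deviation of the treatment and control groups."""
    return math.sqrt(
        ((n_e - 1) * sd_e ** 2 + (n_c - 1) * sd_c ** 2) / (n_e + n_c - 2)
    )


def smd_pre_post(m_e1, m_e2, m_c1, m_c2, sd_pooled):
    """Treatment change minus control change, divided by the pooled SD."""
    return ((m_e2 - m_e1) - (m_c2 - m_c1)) / sd_pooled


def odds_ratio_to_smd(odds_ratio):
    """Chinn (2000) conversion: ln(OR) divided by pi / sqrt(3)."""
    return math.log(odds_ratio) / (math.pi / math.sqrt(3))


def small_sample_correction(d, n_1, n_2):
    """Small-sample bias correction: d * (1 - 3 / (4 * (n_1 + n_2) - 9))."""
    return d * (1.0 - 3.0 / (4.0 * (n_1 + n_2) - 9.0))


# Hypothetical study: both groups start at 10; treatment rises to 12, control to 11.
sd = pooled_sd(sd_e=2.0, n_e=30, sd_c=2.0, n_c=30)
d = smd_pre_post(m_e1=10.0, m_e2=12.0, m_c1=10.0, m_c2=11.0, sd_pooled=sd)
g = small_sample_correction(d, 30, 30)
print(d, g, odds_ratio_to_smd(2.0))
```

The correction factor approaches 1 as the combined sample grows, so it matters mainly for small studies.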

We examined funnel plots from each meta-analysis for visual evidence of
asymmetry, and conducted Egger tests (Egger, Smith, Schneider, & Minder, 1997) to
obtain a statistical test for asymmetry. The Egger test fits a regression of the
normalized effect estimate (estimate divided by its standard error) against precision
(the reciprocal of the standard error of the estimate).
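
The regression underlying the Egger test can be sketched with ordinary least squares, as below. This is a minimal sketch under simplifying assumptions rather than the procedure used in the review: the function name and the example data are hypothetical, and the significance test on the intercept (a t-test with k - 2 degrees of freedom) is left to the reader.

```python
import math


def egger_regression(effects, std_errors):
    """Regress effect/SE on 1/SE by ordinary least squares.

    Returns (intercept, slope, standard error of the intercept). An intercept
    far from zero relative to its standard error suggests funnel asymmetry.
    """
    n = len(effects)
    if n < 3 or n != len(std_errors):
        raise ValueError("need at least three studies with matching inputs")
    y = [d / se for d, se in zip(effects, std_errors)]  # normalized effect
    x = [1.0 / se for se in std_errors]                 # precision
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = resid_ss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, slope, se_intercept


# Hypothetical effect sizes and standard errors (illustration only).
effects = [0.45, 0.30, 0.28, 0.15, 0.22]
std_errors = [0.30, 0.20, 0.15, 0.08, 0.10]
intercept, slope, se_intercept = egger_regression(effects, std_errors)
print(intercept, slope, se_intercept)
```

When every study estimates the same effect regardless of its precision, the fitted line passes through the origin and the slope equals that common effect.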

We conducted analyses to determine whether the effects of the mentoring
interventions varied by five key aspects of the intervention approach and
characteristics. Potential moderators that were tested were:

1) selectivity in inclusion (high individual risk, high environmental risk, or no such
selectivity)

2) whether or not mentoring is a stand-alone approach in that study or was
undertaken along with a) some other major intervention components or b) some
relatively minor add-ons

3) the motivation of the mentors in participating (civic duty, professional
development, own experience)

4) the extent to which quality of work and fidelity were assessed or emphasized

5) explicit attention to presence of four key processes: modeling/identification
promotion, emotional support, advocacy, and teaching

Inspection of the coding across studies indicated that we had to simplify some
moderation analyses because few or no studies noted a particular characteristic of
interest. For selection of participants, none of the interventions was coded as
universal; thus, under selection we could only test for moderation by the presence or
absence of selection for individual risk and selection for environmental or ecological
risk. We could not consider personal experience as a motivation as there were no
studies in which this was measured or was able to be coded. Thus moderation tests
of mentor motivations were conducted separately for presence or absence of civic
duty and for professional development as motivation.

Only the tests of inclusion of other interventions with mentoring included all 46
studies. Other moderator analyses were limited by whether coders could determine
whether the moderating factor was present or absent. The analysis of whether
motivation by civic duty significantly moderated effect sizes included 36 studies,
which was the smallest number of studies in any of the moderation analyses.

To conduct the moderator analyses we used all studies across the four outcomes 
to calculate an overall effect size by moderator condition (i.e., the mean of all effect 
sizes reported in each study). This was done because of the limited number of 
studies available for testing moderators, even if examined for each outcome 
separately. We also reasoned that the interest was in testing moderation of 
mentoring for studies of delinquency and/or the related outcomes rather than for 
each specific outcome. That is, this meta-analysis is focused on youth at risk for 
delinquency from the view that the four outcomes are related in sharing risk factors 
and the likely impact of mentoring features. This approach has been used in other 
meta-analyses where multiple outcomes are of interest (see DuBois et al., 2011). In 
addition, given the strain that moderation analyses place on statistical power in data 
sets as limited in size as this one, we used, as others have done, a p level of .05 
(one-tailed test). This standard was also used for these analyses because in each 
case we expected larger effects if the moderator was present, and the specific order 
of the levels of the moderator was not at issue. This is equivalent to a two-tailed p < 
.10, which has been justified given the power challenges for moderation effects 
(Wilson & Lipsey, 2007). 
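
The correspondence between a one-tailed p of .05 and a two-tailed p of .10 can be checked directly from the standard normal distribution. The following is a minimal sketch, not part of the review's analysis; the function names are illustrative assumptions.

```python
import math


def upper_tail_p(z: float) -> float:
    """One-tailed p value: P(Z >= z) under the standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def two_tailed_p(z: float) -> float:
    """Two-tailed p value: P(|Z| >= |z|) under the standard normal."""
    return 2.0 * upper_tail_p(abs(z))


if __name__ == "__main__":
    # z = 1.645 is the conventional one-tailed .05 critical value.
    for z in (1.282, 1.645, 1.960):
        print(f"z = {z:.3f}  one-tailed p = {upper_tail_p(z):.3f}  "
              f"two-tailed p = {two_tailed_p(z):.3f}")
```

For a positive Z the two-tailed value is exactly twice the one-tailed value, which is why a one-tailed criterion of .05 admits the same Z statistics as a two-tailed criterion of .10 when the effect lies in the expected direction.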

We tested for moderation with meta-regression analysis using the rma function in 
the metafor package in R (Viechtbauer, 2010). Each meta-regression analysis 
employed a random effects model that included a term for the moderator under 
consideration and a term representing whether the study was a randomized design 
or a quasi-experimental trial. The significance tests are one-tailed Z-tests. 
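
The structure of such a model can be illustrated with a small weighted least squares sketch in Python. This is not the metafor implementation: the between-study variance tau2 is taken here as a given input rather than estimated, the data in the usage example are hypothetical, and all function and variable names are assumptions made for illustration.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def meta_regression(effects, variances, moderator, is_rct, tau2=0.0):
    """Weighted regression of effect sizes on a 0/1 moderator and a 0/1 design term.

    Weights are 1 / (within-study variance + tau2); tau2 is assumed known.
    """
    rows = [[1.0, float(m), float(d)] for m, d in zip(moderator, is_rct)]
    weights = [1.0 / (v + tau2) for v in variances]
    p = 3
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
             for j in range(p)] for i in range(p)]
    xtwy = [sum(w * r[i] * y for w, r, y in zip(weights, rows, effects))
            for i in range(p)]
    cov = invert(xtwx)
    beta = [sum(cov[i][j] * xtwy[j] for j in range(p)) for i in range(p)]
    se = [math.sqrt(cov[i][i]) for i in range(p)]
    z_moderator = beta[1] / se[1]
    # One-tailed test in the expected direction (moderator present > absent).
    p_one_tailed = 0.5 * math.erfc(z_moderator / math.sqrt(2.0))
    return {"beta": beta, "se": se, "z_moderator": z_moderator,
            "p_one_tailed": p_one_tailed}


if __name__ == "__main__":
    # Hypothetical data for eight studies, for illustration only.
    moderator = [0, 0, 1, 1, 0, 1, 1, 0]
    is_rct = [0, 1, 0, 1, 1, 0, 1, 0]
    effects = [0.05, 0.12, 0.30, 0.41, 0.08, 0.25, 0.38, 0.02]
    variances = [0.02, 0.01, 0.03, 0.02, 0.015, 0.025, 0.01, 0.02]
    print(meta_regression(effects, variances, moderator, is_rct, tau2=0.01))
```

The coefficient on the moderator term is the estimated difference in effect size between studies with and without the moderator, adjusted for design type; dividing it by its standard error gives the Z statistic used for the one-tailed test.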

We also conducted sensitivity analyses to assess the effects on conclusions of 
changes made to the inputs of an analysis (Morgan & Henrion, 1990). Accordingly, 
we conducted analyses to determine (1) the consistency of effect sizes obtained with 
different outcome variables, and (2) the consistency of outcomes within different 
levels of moderated analyses. 
4 Results 


4.1 MAIN EFFECT META-ANALYSES RESULTS 


In the first phase of the updated review's literature search we identified additional 
studies, for a total of 164 studies that were further evaluated against basic 
criteria for outcome and intervention type. Of these studies, 58 (34%) were 
determined to have none of the target outcomes. The remaining 107 were subjected 
to further scrutiny to determine their methodological suitability for the 
meta-analysis. Of these, 53 (33%) had research designs that did not meet minimum 
quality standards for inclusion and 6 (4%) did not provide sufficient information for 
calculating effect sizes related to the outcomes in question. This left 46 (28%) studies 
that were included in the quantitative review. The 118 excluded studies can be found 
in Table 7. 

Table 3 provides details on the 46 studies selected for the meta-analysis, including 
citation, sample characteristics, design type, component and intervention 
information obtained for moderation analyses, and basic findings. Of the 46 studies 
included, 27 were randomized controlled trials and 19 were quasi-experimental 
studies involving non-random assignment, but with matched comparison groups as 
described above. Twenty-five studies reported delinquency outcomes, 25 
reported academic achievement outcomes, 6 reported drug use outcomes, and 7 
reported aggression as an outcome. 

Prior to calculating the mean effect size, we evaluated the heterogeneity of study 
effect sizes using multiple homogeneity measures, standard errors, and associated 
probability levels, including Cochran's Q and I² (Higgins, Thompson, Deeks & 
Altman, 2003). Cochran's Q is an indicator of heterogeneity that is distributed as a 
chi-square. Significant values of Q indicate heterogeneity. The degree of 
heterogeneity can be seen in the I² statistic, which indicates the approximate 
proportion of variance across the compared studies that is due to heterogeneity of 
effects.¹
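
To make these heterogeneity statistics concrete, the following Python sketch computes Cochran's Q, I², a between-study variance estimate, and a random effects mean from study effect sizes and their sampling variances. The DerSimonian-Laird estimator is used here as an assumption, since the review does not state which estimator was applied, and the function names and example data are hypothetical.

```python
import math


def i_squared(q: float, df: int) -> float:
    """I-squared (percent) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def random_effects(effects, variances):
    """Q, I-squared, DerSimonian-Laird tau-squared, and the random effects mean."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi * wi for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {"q": q, "df": df, "i2": i_squared(q, df), "tau2": tau2,
            "smd": mean, "ci": (mean - 1.96 * se, mean + 1.96 * se)}


if __name__ == "__main__":
    # Hypothetical effect sizes and variances, for illustration only.
    print(random_effects([0.05, 0.20, 0.45, 0.10], [0.02, 0.01, 0.03, 0.015]))
    # I-squared implied by a Q of 3297.64 on 24 degrees of freedom.
    print(round(i_squared(3297.64, 24), 1))
```

I² depends only on Q and its degrees of freedom, so a reported Q can be converted to I² without access to the study-level data.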


¹ In a sensitivity analysis we tested for influence of studies with multiple outcomes on effects and found 
that the effect sizes in studies with single outcomes (SMD = 0.27, 95% CI = 0.12-0.41) were slightly 
but not significantly higher than the effect sizes in studies with multiple outcomes (SMD = 0.22, 95% 
CI = 0.07-0.38). Cross-tabulation of multiple outcomes by moderator variables revealed a single 
significant difference. Studies with a single outcome were more likely to have selected for 




The Campbell Collaboration | www.campbellcollaboratlon.org 


We inspected forest plots of the effects and confidence intervals to explore for 
potential outlying studies. After identifying possible outlying 
studies, we repeated the meta-analyses to determine whether removal of up 
to five outlying studies would reduce or eliminate the heterogeneity. 

As can be seen in Table 4, heterogeneity of effects was substantial for delinquency 
and academic achievement. Examination of forest plots and re-analysis with 
removal of outlying studies did not appreciably reduce the heterogeneity of effects of 
mentoring for either delinquency or academic achievement. It seems evident that there 
is substantial heterogeneity among studies in effects for delinquency and academic 
achievement. 

To assist in understanding the heterogeneity in effect sizes, we conducted an 
analysis to determine whether the effect sizes differed substantially between 
randomized controlled trials (RCTs) and quasi-experimental designs. Using 
meta-regression with study design as the predictor, we found that although effect sizes 
were numerically larger in RCTs for all outcomes except drug use, none of these 
differences was statistically significant (Hedges & Pigott, 2004, formulas 11-12, p. 
432). 
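
A contrast between two group mean effect sizes can be sketched as a simple Z test on the difference between independent means. The example below applies it to the subgroup summaries reported in the footnote on single versus multiple outcomes (SMD = 0.27, 95% CI 0.12 to 0.41, versus SMD = 0.22, 95% CI 0.07 to 0.38). Recovering standard errors from the reported confidence limits, and treating this form of the test as matching the cited formulas, are assumptions made for illustration.

```python
import math

Z_95 = 1.96  # normal quantile assumed to underlie the reported 95% intervals


def se_from_ci(lower: float, upper: float) -> float:
    """Approximate standard error recovered from a symmetric 95% interval."""
    return (upper - lower) / (2.0 * Z_95)


def contrast_z(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Z statistic for the difference between two independent mean effect sizes."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


def two_tailed_p(z: float) -> float:
    """Two-tailed p value under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    se_single = se_from_ci(0.12, 0.41)
    se_multiple = se_from_ci(0.07, 0.38)
    z = contrast_z(0.27, se_single, 0.22, se_multiple)
    print(f"Z = {z:.2f}, two-tailed p = {two_tailed_p(z):.2f}")
```

With these inputs the Z statistic is well below conventional critical values, in line with the footnote's description of the difference as slight and not significant.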

For each outcome we calculated an average effect size, a 95% confidence interval, 
and a related Z statistic. To facilitate interpretation, we scaled all outcomes so that 
positive effect sizes represent effects in the desired direction, i.e., lower delinquency, 
aggression, and drug use, and higher academic achievement or lower school failure. 

Table 4 reports the results of the meta-analysis for each of the four studied 
outcomes. 

Delinquency 

As can be seen in Table 4, the 25 studies with a delinquency outcome yielded an 
average effect size of SMD = .21 (range: -0.25 to 1.73; 95% confidence interval: 0.17 
to 0.25; p < .01). Heterogeneity was substantial, as indicated by an I² of 99.3% (Q(24) 
= 3297.64, p < .01). Examination of a funnel plot for delinquency revealed some 
asymmetry involving the three studies with the largest effect sizes, and an Egger test 
confirmed the presence of asymmetry (bias = 6.79, t(23) = 2.74, p < .05). We 
conducted a sensitivity analysis by removing these studies and repeating the 
meta-analysis. The difference was very slight. With the full sample, the SMD from the 
random effects model was 0.21 (p < .001; τ² = .008). With the reduced sample, the 
SMD from the random effects model was 0.19 (p < .001; τ² = .008). Finally, we 
applied the trim and fill method (Duval & Tweedie, 2000) to account for publication 
bias in the random effects estimate. The result was an estimated effect of 0.18 (p < 
.001; τ² = .009). 
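
The Egger test reported above can be sketched as an ordinary least squares regression of each study's standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); the intercept is the bias estimate and is tested with a t statistic on k - 2 degrees of freedom. The code below is an illustrative sketch with hypothetical data, not the procedure or software used in the review.

```python
import math


def egger_test(effects, std_errors):
    """Egger regression: returns (bias intercept, t statistic, degrees of freedom)."""
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    df = n - 2
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_var = rss / df
    se_intercept = math.sqrt(residual_var * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, intercept / se_intercept, df


if __name__ == "__main__":
    # Hypothetical data: 25 studies in which less precise studies report
    # larger effects, the pattern that produces funnel plot asymmetry.
    std_errors = [0.05 + 0.01 * i for i in range(25)]
    noise = [0.01 * ((i * 7) % 5 - 2) for i in range(25)]
    effects = [0.2 + 2.0 * se + e for se, e in zip(std_errors, noise)]
    bias, t_stat, df = egger_test(effects, std_errors)
    print(f"bias = {bias:.2f}, t({df}) = {t_stat:.2f}")
```

With 25 studies the test has 23 degrees of freedom, matching the t(23) reported for the delinquency outcome.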


environmental or ecological risk than were studies that reported multiple outcomes, χ²(1, N = 36) = 
3.94, p < .05. 
Aggression 

As can be seen in Table 4, a random effects model of the seven studies with an 
aggression outcome yielded an average weighted effect size of SMD = .29 (range: 
-0.05 to 0.95; 95% confidence interval: -0.03 to 0.62, ns). The funnel plot for 
aggression revealed no asymmetry, and the Egger test confirmed this impression 
(bias = -1.41, t(5) < 1, ns). 

Drug Use 

As can be seen in Table 4, a random effects model of the six studies with a drug use 
outcome yielded an average weighted effect size of SMD = .16 (range: -0.13 to 0.18; 
95% confidence interval: 0.04 to 0.29, p = .05). On drug use, there appeared to be 
funnel plot asymmetry due to the single negative effect, but the Egger test did not 
find evidence of bias (bias = 16.41, t(4) < 1, ns). Removal of this effect in a 
sensitivity analysis resulted in a stronger combined effect (full sample: SMD = .16, p 
= .05, τ² = .04; reduced sample: SMD = .19, p < .001, τ² = .0002). 

Academic Achievement 

As can be seen in Table 4, the 25 studies with an academic achievement outcome 
yielded an average effect size of SMD = .11 (range: -0.04 to 1.45; 95% confidence 
interval: 0.03 to 0.31). On academic achievement, graphical examination suggested 
that there might be funnel plot asymmetry due to three studies with large effect 
sizes. Removal of these effects in a sensitivity analysis resulted in a weaker, but still 
significant, combined effect (full sample: SMD = .11, p < .0001, τ² = .006; reduced 
sample: SMD = .05, p < .01, τ² = .005). An Egger test of bias found no evidence of 
bias with the full sample (bias = 4.55, t(23) = 1.65, p = .11). 

Average Effect 

Table 4 also reports the average effect, which was used for the moderation analysis. 
The 46 studies yielded an average effect size of SMD = .18 (range: -0.21 to 1.70; 95% 
confidence interval: 0.15 to 0.21). 

We also created forest plots for each outcome to show the variation of individual 
studies about the aggregate effect size. These are the effect sizes from inverse 
variance weighted random effects models. They are provided, with accompanying 
statistics, in Figures 1-4, corresponding to Delinquency, Aggression, Drug Use, and 
Academic Achievement, respectively. Across the four outcomes the pattern is one of 
relatively consistent direction and size of effect sizes within a given outcome, but 
with a few studies showing confidence intervals that include zero or negative effects 
for each outcome. 
The patterns of effect sizes and the forest plots suggest the average effect sizes 
represent robust estimates of the effect of mentoring on each outcome. The aggregate 
effect size estimates, although modest, are all positive. 


4.2 MODERATOR ANALYSES 


We conducted analyses to determine whether the effects of the mentoring 
interventions varied by four key aspects of intervention design and implementation 
and by four key processes theorized as avenues for mentoring effects: 

1) selectivity in inclusion (high individual risk, high environmental risk, or no such 
selectivity); 

2) whether mentoring was a stand-alone approach in that study or was 
undertaken along with a) some other major intervention components or b) some 
relatively minor add-ons; 

3) the motivation of the mentors in participating (civic duty, professional 
development, own experience); 

4) the extent to which quality of work and fidelity were assessed or emphasized; 

5) explicit attention to the presence of four key processes: modeling/identification 
promotion, emotional support, advocacy, and teaching. 

As noted earlier, we combined across outcomes for these analyses given the 
constrained sample sizes, and used a one-tailed test, as all moderator analyses tested 
the null hypothesis that the effect of the moderator was zero, regardless of which level 
of the moderator was coded "1" and which was coded "0". To check on the validity of 
combining across outcomes, we tested for bias in effects due to this aggregation (e.g., 
effects limited to one outcome or heavily dependent on a specific outcome). To do 
so we conducted two sets of sensitivity analyses. For the first set of analyses, we 
employed Hedges and Pigott's (2004, formulas 11-12, p. 432) method for contrasting 
group mean effect sizes in meta-analysis to contrast effect sizes from studies 
reporting each outcome against those reporting each of the other three outcomes. 
These results produced no evidence that 
effect sizes differed substantially by any given outcome, which suggests that 
moderation relations were not due to a true relation with only a single outcome: Z 
(delinquency-aggression) = -0.17, ns; Z (delinquency-drug use) = 1.61, ns; Z 
(delinquency-academic) = 1.77, ns; Z (aggression-drug use) = 0.74, ns; Z 
(aggression-academic) = 0.81, ns; and Z (academic-drug use) = -0.07, ns. For the 
second set, we coded the outcomes of each study according to the outcome variables 
used (e.g., 1-4 = Delinquency, Aggression, Drug Use, Academic Achievement). We then 
cross-tabulated these codes with categorical scores for whether a given moderator could be 
coded. No significant results were obtained. Only one moderator, professional 
development as a motivation for mentoring, showed any such tendency, with a 
marginally higher than expected frequency by outcome (for academic achievement), 
χ²(5, n = 36) = 11.05, p < .05. These results gave us sufficient confidence 
that moderation analyses collapsed across outcomes would not be biased or 
misrepresent an overall relation for mentoring programs. In combination with 
the practical consideration of sample size limitations, we judged this an appropriate 
way to serve the goals of the review with the available studies. 

We tested for moderation using two methods. First, we calculated meta-analysis 
statistics separately by levels of the moderators (Hunter & Schmidt, 2004, p. 402). 
Table 6 reports the standardized mean difference effect sizes by levels of each 
moderator, the number of studies in each level of the moderator, and the lower and 
upper limits of the 95% confidence intervals for each random effect estimate. Second, 
we used the meta-regression analyses described above; Table 6 also reports the 
moderator effect estimates, standard errors, and significance tests from those analyses. 

As can be seen in Table 6 there was significant moderation for Motivation for 
Mentoring but not for other program organization and implementation features. 

We provide plots for Mentor Motivation in Figure 5. As can be seen in Figure 5 
effects were larger when mentor motivation was based in professional development. 

In regard to key processes of mentoring interventions, there was evidence of 
significant moderation by the presence of two component processes in mentoring: 
Advocacy and Emotional Support (see Table 6). The results are illustrated in Figure 
6. Stronger effects were observed when Emotional Support and Advocacy were 
components of mentoring than when these components were not present. Figure 6 
suggests that stronger effects were observed when teaching was a component of 
mentoring, but the meta-regression that included a term for research design did not 
return significant evidence of moderation. 
5 Conclusions 


This review of the methodologically adequate studies released between 1970 and 
2011, focused primarily on United States populations and testing mentoring for 
high-risk youth, found positive effects for delinquency and for three other associated 
outcomes: aggression, drug use, and academic performance. These findings 
suggest mentoring is beneficial for at-risk youth in reducing delinquency, aggression, 
and substance use, and in improving academic functioning. In addition, we found that the 
size of the effects varied by some key features, which include the mentor's motivation for 
being a mentor (those with interest in professional development had larger effects) 
and whether two of four theorized key processes were part of the mentoring effort 
(Advocacy and Emotional Support, with strong suggestion for Teaching). Alongside 
these overall effects, for each outcome and among the studies with the 
beneficial features, there was substantial variation in effect sizes. 

The effects are significantly different from zero for all four outcomes. However, all 
were modest in size (ranging from .11 for academic achievement to .16 for drug use, 
.21 for delinquency, and .29 for aggression). These effect sizes are comparable to those of 
other interventions aimed at high-risk youth for each outcome. 

These results suggest mentoring, at least as represented by the included 
studies, has positive effects for these important public health problems among 
those at risk for delinquency. As this portion of the population is of 
particular interest given their elevated risk for problems not just in 
delinquency but in many other areas of functioning, the evidence that mentoring 
has significant effects, even if modest in size, suggests it could be part of 
the strategies to try to prevent actual engagement in delinquency and drug 
use and to curtail or prevent aggression and poor academic achievement 
(Tolan & Gorman-Smith, 2003). In addition, there was substantial 
heterogeneity in effect size across programs for each outcome, suggesting 
there may be more substantial benefits that could be gained when mentoring 
is organized in ways that maximize those features associated with larger 
effects. 

However, there were several limitations of the available literature that preclude 
statements about what makes mentoring most effective or what accounts for 
benefits. Perhaps most notably, the collected set of articles is remarkably limited in 
describing the actual program activities, what was and was not expected among a range 
of potential mentoring activities, and how key implementation features were 
organized, trained, and/or assessed for competence and fidelity. Unfortunately, this 
state of reporting detail and completeness does not seem to be improving; 
more recent publications are not clearly more informative. 

This longstanding concern is part of what prompted the formulation of key 
processes and the attempted coding of these in this review. As we noted in the 
introduction and as we attempted to code, there are key characteristics thought to 
distinguish mentoring from other helping relationships and to be the basis for 
benefits. Therefore, these qualities should be common across studies and their 
quality should relate to effect size. However, for a significant portion of studies the description 
of the intervention content, organization, and/or implementation was insufficient to 
code one or more of these important characteristics. This state of the reporting of 
details about interventions constrained the sensitivity of our moderation analyses and 
the completeness of the comparisons for the body of research considered here. 

The notable lack of adequate reporting of specific components, implementation 
procedures and adherence, and measurement of targeted processes to permit 
comparison on these important features is seen as a major impediment to advancing 
knowledge about the value of this popular approach to youth intervention. It may be 
that the full potential of the approach is not being achieved, as what may improve effects 
is difficult to discern. Importantly, there is limited ability to determine meta-analytically 
what characteristics of mentoring programs and which approaches are 
most advantageous and might provide direction for more effective programs. Thus, 
we have limited ability to suggest specific priorities for further study. 

We were able to conduct some moderator analyses despite these limitations. The 
results for tests of several features of organization and implementation of mentoring 
suggest that effects were larger when mentors were motivated to participate by 
interest in advancing their professional careers. This is an important finding, as 
most mentoring is undertaken as a voluntary activity. In some cases mentoring 
may help a mentor by fulfilling requirements at work, by serving as an entry-level position 
toward a professional staff position, or by providing experience that can make them a 
more attractive candidate for educational or occupational opportunities. While 
beyond the scope of this review, the results may also raise questions about the 
presumption that mentoring should not be done other than as a voluntary activity. 

Although the review focused on selective and indicated populations (those with risk 
characteristics or already exhibiting delinquency as a basis for inclusion), we did not 
find moderation by whether inclusion depended on individual risk characteristics or 
on environmental or other-than-individual characteristics. While we are duly cautious 
about interpreting these null effects, the finding may suggest that either approach 
may be viable for effective targeting. 
We also did not find that effects differed between mentoring offered with other 
interventions, or as part of a multi-component intervention, 
and mentoring offered on its own. This leaves open whether the effect when 
other interventions are present is attributable to mentoring, but it does suggest that 
mentoring, at least as represented in these collected studies, has effects apart from 
those attributable to other interventions. Within the overall concern about the 
quality of information about mentoring programs, there is much need for 
designs that consider mentoring singularly and as part of a package, or in 
comparison to other singular interventions. This could help clarify not only the 
relative importance of other components but also the relative value of mentoring in 
comparison to other interventions that might be alternatives. As issues such as cost 
effectiveness, ease of training and implementation, and sustainability come into 
consideration, such information is increasingly important. 

Similarly, we did not find differences by whether the extent and fidelity of 
implementation of expected activities and program features was measured. 
While what comprises a mentoring program to test fidelity against is in some cases 
not clear, the impression from the limited number of studies we could code for this 
is that this field is behind others in such design and evaluation considerations. As 
with the other factors noted here, more attention to this would likely improve 
understanding and efficiency of program advancement. 

Moderation tests of four key processes found to be mentioned frequently in the 
literature and in descriptions of some programs found that at least two matter in 
regard to effects. Programs that included emphasis on emotional support and those 
that emphasized advocacy for the recipient had larger effects. While teaching and 
modeling/identification did not significantly relate to effect size, there was some 
suggestion these may be worthwhile foci of attention in mentoring design. Perhaps 
with more studies that could be coded and more attention to documentation of such 
processes, the role of these four processes can be better delineated. The present 
results suggest programs might want to ensure that emotional support from the mentor 
is emphasized, and that methods and opportunities to advocate could also be helpful. 
Our results in regard to the latter are consistent with those reported by DuBois et al. 
(2011) for mentoring in general when measured across many outcomes. 

These findings are consistent with prior meta-analyses that overlap in focusing on 
mentoring. As reported by Lipsey and Wilson (1998) and DuBois et al. (2002, 2011), 
these analyses suggest general support for mentoring as an intervention related to 
delinquency and closely associated outcomes. However, as those analyses found, the 
information obtainable about the "inside" of these interventions termed mentoring 
is limited. Thus, the conclusions to be drawn about what 
it is that makes mentoring effective must remain very sketchy. This persistent 
characteristic of the field undercuts the ability to recommend mentoring for use, as it 
is not clear what should be 
recommended. Further, while the positive effects suggest promise, the lack of 
the standard types of information and formal approaches to documentation that 
characterize the best studies in most areas of behavioral intervention seriously 
impedes incremental progress in best practices. Thus, while consistent with prior 
findings, there seems to be little additional certainty about the nature of mentoring and 
little information to guide further development, sound training and management of 
programming, and adequate tracking of effects to activities, staffing, and other 
features. Unfortunately, this seems to be qualitatively the same state of need as was 
identified in our consideration of mentoring in a review of violence prevention 14 
years ago (Tolan & Guerra, 1994). This is not the case for most areas of delinquency 
intervention. 

This lack of progress and lack of attention to intervention design features and 
program characteristics is particularly of note because mentoring is one of the most 
common and most favored approaches for prevention of risk and for youth 
development. It is also one with considerable presence in the scientific literature. 
While only 46 of the 164 studies located met criteria for inclusion, this does not 
mean the other 118 were of no value for informing science. Yet, after reviewing these, 
we note they are not marked by more detailed attention to the conventions of 
design and reporting that have helped advance prevention and intervention 
capabilities for other approaches. Given the prominence of mentoring in attempts to 
address these critical public health and youth problems, such a lack of systematic 
attempts to unpack mentoring and to understand it within a conventional 
framework for evaluating interventions is surprising. It is also striking that funding 
and promotion of these efforts proceed without more stringent evaluation, including 
more careful identification of the population of interest, inclusion criteria, skills and 
training of providers, content and theorized processes of component effects, fidelity 
tests, and implementation levels for intent to treat. 

Thus, we can only suggest some tentative and general statements about what might 
affect mentoring impact. Perhaps the more striking statement to be made is that 
despite its popularity and the apparent benefits it provides, there is little 
understanding of just what makes an intervention mentoring and what about 
interventions so labeled is related to the benefits derived. Perhaps most fundamentally, the 
co-occurring popularity and the general promise of these findings point to the 
critical need for concerted efforts toward substantial and probably large-scale 
evaluations. These are needed to efficiently provide clearer and more directive 
information about what it is about mentoring that produces positive effects. 

In particular it may be that the promise suggested in the modest effect sizes yielded 
here is only a base estimation of potential benefit. Similarly, the suggestion that 
some design features and some emphases are related to larger effects may only point 
to the potential gain that could come from more careful and concerted formalization 
of intervention evaluation. 





6 Plans for updating the Review 


The review will be updated every 5 years. 
8 Tables and Figures 

Table 1 

Categories and Variables for Meta-Analysis 

Composite Category      Variables 

Delinquency             Self-reports of delinquency 
                        School conduct reports 
                        Teacher Report Form (TRF) or teacher BASC delinquency scales 
                        Arrest records 
                        Court records 

Aggression              Peer nominations of aggression 
                        Teacher reports on the TRF or BASC 
                        Parent CBCL or BASC reports 
                        Self-reports 
                        Behavioral observations 

Substance Use           Self-reports (e.g., SRD) 
                        Arrest records 
                        Court records 
                        Teacher reports 
                        Parent reports 

Academic Achievement    School grades 
                        Standardized test scores (e.g., ITBS) 
                        Self-reports 
                        Archival graduation or withdrawal records 

TRF = Teacher Report Form of the Child Behavior Checklist (Achenbach, 1991) 
BASC = Behavioral Assessment System for Children (Reynolds & Kamphaus, 1992) 
CBCL = Child Behavior Checklist (Achenbach, 1991) 
ITBS = Iowa Test of Basic Skills (Hieronymus, Hoover & Lindquist, 1986) 



Table 2. Combinations of Search Terms Used 

Mentoring terms: Mentor; Role Model; Modeling; Interpersonal Relationship 

Outcome        Paired term                  Combination shown 

Delinquency    Intervention                 delinquency and mentor and intervention 
                                            delinquency and modeling and intervention 
Delinquency    Outreach Program 
Delinquency    Trial                        delinquency and modeling and trial 

Aggression     Intervention 
Aggression     Outreach Program             aggression and role model and outreach program 
Aggression     Trial 
Aggression     Psychoeducational Methods    aggression and interpersonal relationship and 
                                            psychoeducational methods 

Note: Combinations shown for delinquency and aggression outcomes only. Similar searches were performed 
for substance use and academic achievement. Derivative forms of each term were also considered. 
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Table 3 (in 3 sections) details of 46 studies included in Meta- Analysis 




Effect Size 

Sample Size 

Citation(s)^ 

Quality Delinq. 

Agg. 

Acad. Subs. 

Tx 

Con Outcomes 

Abbott, Meredith, Self- 
Kelly, & Davis (1997) 

3 0.07 

-0.05 

0.42 

22 

22 Revised Problem Checklistfoi 
conduct disorder and socializt 
aggression: school grades 

Aiello (1988) 

3 


-0.14 

55 

42 GPA 

Anderson (1977) 

3 -0.14 



76 

76 severity of subsequent offensi 

Aseltine, Dupre, & 
Lamlein (2000) 

3 


0.01 .19 

76 

118 self-reported grades and 
substance use 

Barnoski (2002) 

3 0.22 



78 

78 criminal recidivism 

Berger & Gold (1978) 

5 0.07 



46 

18 Self-reported frequency of 
delinquency 

Bernstein etal (2009) 

5 -0.003 


-0.03 

1163 

1197 Self report of delinquency, scf 
report of disciplinary action, 
school records of grades ( ma 
english, science, social studie 

Blechman, Maurice, 
Buecker, & Heiberg 
(2000) 

3 -0.18 



45 

137 Post-intake rearrest 

Brooks (1995) 

3 


-0.21 

23 

19 GPA 

Buman &Cain (1991) 

3 0.16 


0.26 

137 

107 High School Graduation, polic 
arrest record. 

Cavell & Hughes (2000) 

5 

0.02 


31 

29 CBCL Aggression scores. 

Clarke (2009) 

4 -0.24 


0.80 

14 

11 Subject GPA, self report of 
"negative school behavior" 

Converse & 
Lingugaris/K raft (2009) 

3 -1.24 



16 

15 Discipline Referrals 


60 


The Campbell Collaboration | www.campbellcollaboratlon.org 


Davidson (1976) 
Davidson & Redner 
(1988) 

Davidson, Seidman, 
Rappaport, Berck, Rapp, 
Rhodes & Herring (1977) 
Ku&Blew (1977) 
Seidman, Rappaport& 
Davidson (1980) 

5 

1.70 


25 

12 Police records. 

Davidson (1976) 
Davidson & Redner 
(1988) 

Davidson, Seidman, 
Rappaport, Berck, Rapp, 
Rhodes & Herring (1977) 
Ku&Blew (1977) 
Seidman, Rappaport& 
Davidson (1980) 

5 

0.95 


12 

12 Police records. 

Davidson & Redner 
(1988) 

Davidson, Amdur, 
Mitchell & Redner (1990) 

5 

0.60 


175 

85 Police Records. 

Davis (1988) 

5 

0.15 


20 

20 GPA 

Dicken, Bryson, & Kass 
(1977)^ 

5 


0.29 

20 

12 Parent/teacher reports of chile 
aggression 

Dicken, Bryson, & Kass 
(1977)^ 

5 


0.38 

22 

8 Parent/teacher reports of chile 
aggression 

Flaherty (1985) 

5 


0.00 

21 

21 GPA 

Fo& O'Donnell (1972) 
Fo& O'Donnell (1975) 
Fo& O'Donnell (1979) 
O'Donnell, Lydgate, & Fo 
(1979) 

5 

-0.11 


335 

218 Arrests 

Grant, 2010 

4 


0.02 

13 

13 Grades 

Grossman & Tierney 
(1998) 

Grossman & Rhodes 
(2002) 

Rhodes, Grossman, & 
Resch(2000) 

5 

0.08 

0.11 0.18 

487 

472 self-reported drug use; self- 
reported aggressive behavior; 
GPA 
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Hanlon, Bateman, Simon, 
O'Grady, & Carswell 
(2002) 

3 

0.25 

0.18 

214 

214 Self-reported delinquency anc 
substance use 

Harmon (1995) 

5 


.34 

43 

38 Self-reported substance use. 

Hayes (1998) 

3 


0.32 

60 

25 GPA 

Herrera, Grossman, 
Kauh, Feldman, 
McMaken &J ucovy 
(2007) 

4 

0.01 

0.02 

564 

573 GPA 

Holt, J ohnson & Bry 
(2008) 

4 

-.25 

-0.02 

16 

18 GPA & Discipline Referrals 

Johnson (1997, 1999) 

3 


-0.03 

135 

171 GPA 

Karcher, 2008 

3 


-0.06 

236 

232 Grades 

Keating (1996) 
Keating, Tomishima, 
Foster &Alessandri 
(2002) 

3 

0.29 0.71 


34 

34 Self Report of Delinquency: C 
BehaviorChecklist(CBCL) 

Kelley (1973) 

5 


0.28 

27 

22 GPA 

Kelley, Kiyak, &Blak 
(1979) 

3 

0.49 


65 

63 Police contacts. 

Kemple & Scott-Clayton 
(2004) 

5 


0.08 

729 

729 GED 

Lattimore, Mihalic, 
Grotpeter, & Taggart 
(1998) 

5 

0.43 


56 

44 High School Graduation 

LoSciuto, Rajala, 
Townsend, &Taylor. 
(1996) 

Taylor, LoSciuto, Fox, 
Hilbert & Sonkowsky 
(1990, 1999) 

3 


.20 

180 

193 Frequency of substance use 
during the past 2 months. 

Maxfield, Schirm, & 
Rodriguez-Planas (2003) 

5 

0.04 

-0.04 -.13 

580 

489 GPA, self-reported alcohol use,
self-reported criminal behavior




The Campbell Collaboration | www.campbellcollaboration.org



McCord (1978, 1979) 

5 

-0.03 


253 

253 Criminal records. 

Moore & Levine (1974) 
Moore (1987) 

5 

0.80 


50 

50 Police/court records. 

Newton (1994) 

5 

0.71 

0.93 0.05 

21 

27 Violent incidents at school: 
Grade point average: school 
exclusions (suspensions) 

Polit, Kahn & Stevens 
(1985) 

Quint (1991) 

3 


0.04 

270 

405 School Completion 

Reyes & Jason (1991)

5 


0.07 

77 

77 Standardized test scores. 

Rowland (1992) 

3 


-0.05 

42 

44 School Grades 

Royse (1998) 

5 

0.35 

1.43 

25 

21 Disciplinary infractions and GPA

Schinke, Cole, & Poulin 
(2000) 

3 


0.62 

94 

94 GPA 

Watson (1996) 

5 


0.22 

69 

25 GPA 


^ We include citations for all articles reporting results of the same studies. 

Based on Lipsey & Wilson, we report articles based on Davidson (1976) as two separate studies.
Dicken, Bryson, & Kass (1977) reported separate analyses for males and females. Without sufficient
information to combine these effects, we report them as separate outcomes for the meta-analysis.

Table 3 (second section) details of 46 studies included in Meta Analysis (cont'd)


Citation(s) 

Sample Characteristics (Mentees) 

Ages of 
Mentees 

Sample Characteristics (Mentors)

Abbott, Meredith, 
Self-Kelly, & Davis 
(1997) 

Boys from mother headed, single-parent households. 8-14 yrs., 
mean age 10 years., not diagnosed with mental or physical 
disabilities. 

8-14 

Midwestern affiliate of the B 
America. College educated. 
Brothers. 

Aiello (1988) 

underachieving students 

middle 

school 

education staff members 

Anderson (1977) 

Majority (97%) between 13-17 yrs., 69% male, referred by Juvenile
Dept for either criminal offenses or dependent-incorrigible
(runaway, truant), pg. 49 - Age: 14.13, Severity of Original offense:
3.71 (scale 1-5), 51% male, 48% female.

13-17 

Volunteers recruited througl 
engagements, and universil 
interviews, and identified as 
helping someone have a pn 
professionals, between 26-: 

Aseltine, Dupre, & 
Lamlein (2000) 

Low income 6th grade students living in large urban setting. 

12 to 13 

Adult mentors over age 50.

Barnoski (2002) 

Juveniles from juvenile confinement. Minimum of 5-6 months in
juvenile confinement remaining, non-sex offenders.

Under 18 

Trusted adult volunteer to h 
vocational goals, and live di 
year commitment recruited 
groups, internet and from p 

Berger & Gold 
(1978) 

Juvenile court-selected probationers.

Under 18 

community volunteers 

Blechman, 
Maurice, Buecker, 
& Heiberg (2000) 

Minors charged with nonviolent misdemeanors or first felonies.
Participant gender was 71.8% male (n =176). Ethnicity was 76.7% 
white (n =188), 17.1% Latino (n =42), and 6.1% black, Asian, 
Native American, and multi-ethnic (n = 15). 

8.85-18.33 

Adult volunteers 

Bernstein et al.
(2009) 

4th-8th graders referred for school failure, low self-esteem or lack
of role models. 82% receiving free/reduced lunch. 41% African
American, 23% White, 29% Hispanic. 57% female.

4th-8th 

grade 

Volunteers from community 
strategies. 72% female and 
college aged. 

Brooks (1995) 

High school students nominated by teachers based on academic 
performance and extracurricular activities. From economically
disadvantaged schools, primarily African American (89%) and 
female (86%). 

15-18 
years old 

College student volunteers 
female, 19% male, aged 19 
26% White, min. 2.5 GPA)

Buman & Cain
(1991) 

Summer Youth Employment Program (SYEP) participants (low 
family income) who had automatically assigned Business Partners 
during the summer of 1986, and whose workplace included access 
to phones and were employed for more than 6 weeks. (Control 
group randomly selected from remaining files, and those who did 
not have Business Partners in subsequent years). 14-21 yrs old.

14-21 

Volunteer mentors recruitec 
participating in program to c 
youth whose household inci 
in Minneapolis.

Approx. 70% Black, 9% Asian, <1% Hispanic, 11% Native
American, 8% White: nearly equal male/female. Ave. age 16, 
majority (59%) from mother-only households. 

Cavell & Hughes
(2000)

>84th percentile Aggressive Behavior scale of the Teacher Report
Form. Primarily African American (48%) and White (37%); and
Male (77%).

Grade 2-3

College undergraduate stuc
requirements.

Davidson (1976)
Davidson &
Redner (1988)
Davidson,
Seidman,
Rappaport, Berck,
Rapp, Rhodes &
Herring (1977)
Ku & Blew (1977)
Seidman,
Rappaport &
Davidson (1980)

Local youth contacted by juvenile bureaus and considered in
jeopardy of juvenile court referral. Mostly white (76%) and males
(76%).

Mean age of
14.1 years

College students matched c
race.


Clarke 2009 

Mentees: N = 18, girls and boys. 9th graders identified by teachers
for behavior problems and risk of failing/dropping out of school

9th Grade

Mentors: Teachers and othf 
study sites. (12 teachers & 
male & 50% AA. 

Converse &
Lingugaris/Kraft
(2009)

Mentored group 56% white, 44% Hispanic. Control group 40% 
white, 60% Hispanic. 80% male 

Mentoring provided by 13 fc 
white, 11/13 female. 


Davidson (1976)
Davidson &
Redner (1988)
Davidson,
Seidman,
Rappaport, Berck,
Rapp, Rhodes &
Herring (1977)
Ku & Blew (1977)
Seidman,
Rappaport &
Davidson (1980)

Youth from low income families with prior arrests. Mostly male
(92%); white (58%) and African American (42%).

Mean age
14.5 yrs.

College students matched c
race.

Davidson, Amdur,
Mitchell & Redner
(1990)
Davidson &
Redner (1988)

Juveniles referred from local juvenile court. Mostly males (84%)
and White (77%)

Mean age of
14 years

College students and some

Davis (1988)

Students repeating 9th grade. Mostly males (60%); African
American (46%) and White (54%) 

Mean age of 
15.6 years 

Volunteer teachers and sch 

Dicken, Bryson, &
Kass (1977)

Families of elementary school age boys from low-income families.
Most families were headed by single mothers. All were Caucasian.

6 - 13 years
old

College students; must be.
child: demonstrate motivatic
supervisory sessions.

Dicken, Bryson, &
Kass (1977)

Families of elementary school age girls from low-income families.
Most families were headed by single mothers. All were Caucasian.

6 - 13 years
old

College students; must be.
child: demonstrate motivatic
supervisory sessions.

Flaherty (1985) 

Random sample of basic academic level (math and science) 
freshman students. 71% white, 28.5% black, Asian, or other. 52% 
low socioeconomic class, 19% inner city, 16.5% middle class, 7.5% 
mid-high class, 5% high socioeconomic class. 

14-15 

Members of the teaching st 

Fo & O'Donnell 
(1972) 

Fo & O'Donnell 
(1975) 

Fo & O'Donnell 
(1979) 
O'Donnell, 
Lydgate, & Fo 
(1979) 

Youth referred based on behavior and academic problems 
(truancy, poor academic achievement, classroom disruption, 
curfew violation, fighting). Ave. age 14 (7th, 8th grade). Hawaiian,
Filipino, Japanese, Chinese, and Caucasian.

11-17 

Adult residents of the comn' 
newspaper ads. Aged 17-61 
ethnically and occupational! 
grade-master's degrees (me 
grade). 

Grant, 2010 

6th-8th grade African American boys who were at risk for school
failure (GPA lower than 2.0) who were nominated by teachers and
principals.

6th -8th 
grade 

Spirituality and knowledge c 
were integrated into group r 
mentoring. 

Grossman & 
Tierney (1998) 
Grossman & 
Rhodes (2002) 
Rhodes, 
Grossman, & 
Resch (2000)

Majority of boys (62%) and Minority (not specified, 57%) 

10-16 years 
old 

well educated young profes 

Hanlon, Bateman,
Simon, O'Grady, & 
Carswell (2002) 

Inner-city youth referred as at risk for developing a deviant lifestyle 
and met one or more criteria: alcohol or drugs, history of 
delinquency or other deviant behavior, expulsion from school. 
97.4% black, 2.6% white; 59% male. 50% referred by family, 26%
by school, 17% from community agency, 6% by juvenile justice 
system. 2/3 had been arrested before. 

9-17 yo 

Mentoring positions staffed 
from community (young Afri 
students) who were availab 
sessions 4-5 days/wk. after 
Staff/child ratios 1:8 (never 

Harmon (1995)

Pregnant and parenting teens and young adults of Harford County.
98% female: 48% white, 50% black, 2% other; 80% unemployed: 
42% pregnant 

14-21 yo 

Community volunteers who: 
resemble participant's goals 
role model volunteers

Hayes (1998)

Students identified by their counselors prior to entering 9th grade
as being "at risk" of dropout (ave. family income in low average
range, 25% of students on free or reduced lunch, 11% absentee rate).
More males identified as at risk than females.

grades 9-12 

Volunteer staff members frc 
teachers and support perso 
mentor student for years sti 

Herrera,
Grossman, Kauh,
Feldman,
McMaken &
Jucovy (2007)

40 students chosen from 97 9th graders who were completing a
universal prevention program "Peer Group connection" for high
school transition. At risk for academic failure.

9th graders 

Teacher/ staff at school wh( 
week. 10 mentors for 20 stc 

Johnson (1997,
1999)

At-risk youth based on recommendations from jr. high or high
school teachers/counselors. Half male, half female, 75% black,
middle-achieving students (B-C range GPAs), qualify for free or 
reduced-price lunch program 

grades 9-12 

Mentors recruited through p 
presentations, TV and radio 
gender (but not race). Initial 
check-ins with staff. Most o 
white, with older children wi 
and work in city. 1/3 of men 
contribution toward student' 
previously involved in anoth 

Karcher, 2008 

5th - 8th graders in large southwestern city. Majority from low
income families. Majority Mexican American or Hispanic/Anglo
biracial

5th - 8th 
grade 

School based mentors were 
5% AA and 6% other. 70% 
Spanish. 73% female. 

Keating (1996)
Keating,
Tomishima,
Foster &
Alessandri (2002)

Youth deemed at-risk for juvenile delinquency or mental illness 
(but not involved in serious delinquent behavior). 65% male, 35% 
female: 32% white, 24% black, 37% Latino, 3% Asian, 3% other. 

10-17 yo 

Adults who live in surround! 
in helping troubled youth. M 
for commitment to program 
involvement with at-risk you 
as possible on gender, ethn 
location, and common inten 

Kelley (1973) 

Boys referred from court intake - deemed not serious enough for 
court hearing, but needing intervention. Mean age 14 yrs., 59% 
referred as 1st offense, equal number black/white. 

10-16 yo 

Undergraduate males from 
psychology courses and vol 
requirement. Mean age 27.1 

Kelley, Kiyak, &
Blak (1979)

Youth in juvenile court diversion program. No more than 3
"unofficial" police contacts: no formal adjudication hearings at 
juvenile court; voluntary admission to the program; no extreme 
disabilities: ages 10-17 yrs. Mean age 14.5 yrs. 78% black, 22% 
white. Equal male, female. 

10-17 yo 

Students from 2 urban colle 
(1/2 4 yr. college, 1/2 comm 
psychology courses and vol 
requirement. 

Kemple & Scott- 
Clayton (2004) 

High school students in a large urban school district. 

14-22 

Employer Partners 

Lattimore, Mihalic, 
Grotpeter, & 
Taggart (1998) 

Youth from low income families receiving public assistance in 5 
industrialized areas. Youth enter program as freshman in high 
school, and program continues through 4 years of high school. 

14-20 

Mentor is "Coordinator": pre 
surrogate parent, role mode 
"Associate" (youth) 

LoSciuto, L., A.K.
Rajala, T.N.
Townsend, and
A.S. Taylor. (1996)

6th graders from low income communities. Primarily African
American.

11-12

Volunteers ranging in age ft
American from low income i
year commitment.

Maxfield,
Schirm, & 
Rodriguez-Planas 
(2003) 

Youth entering 9th grade at a high school with dropout rates >
40%. Youth were not repeating 9th grade, did not have disabilities 
that would interfere with participation, and had GPA <67th 
percentile. 

14-18 

Mentors were case manage 

Moore & Levine 
(1977) 

Moore (1987) 

Selected by probation officer to be at high risk for re-offending. All 
were white males. 

16-22 yo 

Citizen volunteers matched 
education/vocation, and inte 

Newton (1994) 

Middle school students selected on basis of school failure and 
history of violent behavior. Primarily male (73%) and African 
American (73%). 

grades 7-8 

College students: primarily i 
American (67%). 

Polit, Kahn & 
Stevens (1985) 
Quint (1991) 

Primarily low-income African American and Latino women who were
pregnant or parenting at the time of study enrollment.

14-17 

Volunteer women from low- 
Range in age from 20s to 7i 
diploma, but not working. N 
teens. 

Reyes & Jason
(1991) 

Ninth grade students from a large urban school with a high (60%) 
dropout rate. Primarily Hispanic. 

9th grade 

Homeroom teachers - trainc 
counseling. 

Rowland (1992) 

Identified as high-risk of dropping out of school before graduation. 

grades 1-5 

Area business men and woi 
retirees, and civic members 

Royse (1998) 

African American teenagers, ages 14-16 from female-headed 
household and less than grade equivalency in reading, math, and 
science. Live in household with income at or below 125 % federal 
poverty guidelines. 

14-16 years 

African American male com 
college graduates in their 3i 

Schinke, Cole, & 
Poulin (2000) 

40% female, ave. age 12.3 yrs., 63% black, 19% Hispanic, 13% 
white, 5% Asian and other. 

12.3 avg 

Boys and Girls Club of Ame
and other volunteers 

Watson (1996) 

Hispanic middle school and high school students identified as "at-risk"
with at least one of the characteristics: 1. retained at least one
grade, 2. 2 or more yrs below grade level in standardized tests,
3. failed at least 2 courses, 4. failed at least one section of the
statewide standardized test.

middle/high 

school 

Senior citizen and college s 
throughout community. 

McCord (1978,
1979)

Boys from densely-populated urban industrial areas identified by
schools, welfare agencies, churches and police as "difficult" or
"average", given physical exams, and then matched in pairs on
age, delinquency-prone histories, family background, and home
environments (coin toss determined group).

5-13 yrs.
Original
study: 35-44
for follow up

Social workers who tried to
relationship with boy and he
in variety of ways. Counseic
with criminal justice agencic

Table 3 (third section) details of 46 studies included in Meta Analysis (cont'd)


Citation 

Description of Mentoring 

Additional Interventions 

Abbott, Meredith, Self- 
Kelly, & Davis (1997) 

Big Brothers/Big Sisters of America. Adult companion program; weekly 
companionship between boy and adult male for 12-18 months for a visit 
and/or activity. Big Brother to serve as positive role model to child in 
vocational, psychological, and social ways. 

None 

Aiello (1988) 

take part in a series of structured and unstructured activities throughout 
the years. Bimonthly meetings between mentors and mentees. 

None 

Anderson (1977) 

One to One program - volunteers spent at least 2 hrs. per week and
were looked at as role models, friends, or assistants

Family Crisis Intervention: Servi 
(children refusing to go home w 
sessions: at publication, data nc 

Aseltine, Dupre, & 
Lamlein (2000) 

Mentors spend at least 2 hours/week in one-on-one contact with youth.
Activities include tutoring, community service, recreational activities, 
and assistance with school projects. 

Community Service (youth spen 
Competence Training (26 weeki 
management, self-esteem, etc), 
weekend events for youth, their 

Barnoski (2002) 

Meet monthly during last 5-6 months of youth confinement in juvenile
facility 

None 

Berger & Gold (1978) 

One on one similar to Big Brothers/Big Sisters 

Some (number not specified) ch
counseling or tutoring. 

Bernstein et al. (2009)

103 schools participating in federally funded examination of the
effectiveness of school based mentoring. The programs focused on
academic goals, self-esteem, relationship building, and giving advice.

None 

Blechman, Maurice, 
Buecker, & Heiberg 
(2000) 

Adult volunteers who spent 2 hours a week for approximately 21 weeks 
with proteges 

Mentors attended a training program 

None, all participants received J
Study compared JD to JD+Ment

Brooks (1995) 

take part in a series of structured and unstructured activities throughout 
the years. Bimonthly meetings between mentors and mentees. 

None 

Buman & Cain (1991)

Volunteer mentors commit to meet Youth Partners at youths' worksites, 
contact them by phone once/week to discuss work issues, accompany 
them to work sponsored events. 

none 

Cavell & Hughes (2000) 

"Therapeutic" mentors received 18 hours of training. Mentor visits were 
at least 1 hour per week outside of school hours for 16 months of 
intervention. Goal of providing accurate understanding, emotional 
acceptance, and firm limits on antisocial behaviors. Engaged in 
interactive activities. 

Treatment group received thera 
trained and supervised), teache 
consultation, and problem-solvir 
received "standard" (i.e., untrair 
mentoring. 

Clarke, 2009

"Achievement Mentoring" - took place during the second semester of
the ninth grade. Adaptation of "Behavioral Monitoring and Reinforcement
Program". Mentors spoke with teachers, met with mentee for 20
minutes and followed up on achievement and goals.


Converse & 
Lingugaris/Kraft (2009)

Mentoring occurred over 18 weeks, for an average of 15 meetings. 
Involved relationship building, support and academic 


Davidson (1976) 
Davidson & Redner 
(1988) 

Davidson, Seidman,
Rappaport, Berck,
Rapp, Rhodes & Herring
(1977)

Ku & Blew (1977)
Seidman, Rappaport &
Davidson (1980)

Relationship building, behavioral contracting, and child advocacy. 

Community Advocacy (targeting 

Davidson (1976) 
Davidson & Redner 
(1988) 

Davidson, Seidman,
Rappaport, Berck,
Rapp, Rhodes & Herring
(1977)

Ku & Blew (1977)
Seidman, Rappaport &
Davidson (1980)

Relationship building and child advocacy. 

Community Advocacy (targeting 

Davidson, Amdur, 
Mitchell & Redner 
(1990) 

Davidson & Redner 
(1988) 

Relationship building, behavioral contracting, and child advocacy. 

None 

Davis (1988) 

Relationship building, support, attendance and academic monitoring. 

None 

Dicken, Bryson, & Kass 
(1977) 

companionship program, 2 visits and 6 hrs. of contact per week during 
an academic semester in a variety of settings. 

none 

Dicken, Bryson, & Kass 
(1977) 

companionship program, 2 visits and 6 hrs. of contact per week during 
an academic semester in a variety of settings. 

none 

Flaherty (1985) 

Members of the teaching staff served as advocating adults for mentees. 

None 

Fo & O'Donnell (1972)
Fo & O'Donnell (1975)
Fo & O'Donnell (1979)
O'Donnell, Lydgate, &
Fo (1979)

Adult buddies attempted to influence youth through their relationship 
and contingent use of social and material reinforcement. Buddies paid 
$144/month by earning points for training, contact, and documentation. 

contingent material reinforceme

Grant, 2010

Christian African American community based mentoring. Offers peer 
group involvement, skill development, knowledge of African American 
culture and mentoring. 

Spirituality and knowledge of Af 
integrated into group meetings i 

Grossman & Tierney 
(1998) 

Grossman & Rhodes 
(2002) 

Rhodes, Grossman, & 
Resch (2000)

Big Brother Big Sisters program (BBBS), 3-4 hr. meetings 2-4 times per 
month for at least one year. 

none 

Hanlon, Bateman, 
Simon, O'Grady, & 
Carswell (2002) 

Group mentoring session 4-5 days/week after school. Homework help, 
regularly scheduled activities and presentations, holiday parties, field 
trips. 

All subjects received individual ( 
experimental clinic were trained 
strategies, were provided suppo 
resources. Counselors also led 
parenting and led program-spor 
events. Subjects in the experimi 
remedial education. 

Harmon (1995) 

Goal to provide opportunity for youth to bond with prosocial others, 
increase self-esteem, life management, and employability skills, and 
decrease favorable attitudes toward drug use. 

Drug education, monthly career 
workshops, "Bright Futures" cur 
(using worksheets, discussion, i 
from self-esteem to drug abuse 
Training (after 80% completion i 
includes weekend retreat). 

Hayes (1998) 

Staff met 4 times for 1 hour during 1st year to receive training in at-risk 
student behavior. Mentors to spend as much time with mentee as they 
feel comfortable. Mentors provided support and guidance to their 
student mentees by placing emphasis on interpersonal relationships, 
problem solving techniques, communication skills, positive behavior, 
study skills. 

None 

Herrera, Grossman, 
Kauh, Feldman, 
McMaken & Jucovy
(2007) 

School based mentoring program through the use of Big Brothers Big 
Sisters mentoring program. Meet once a week at school with mentor 
during or after school from 30-60 minutes. Completed social and 
academic activities as pairs and with groups of mentor matches. 

Tutoring with mentor 

Holt, Bry & Johnson
(2008)

"Achievement Mentoring" - took place during the second semester of
the ninth grade. Adaptation of "Behavioral Monitoring and Reinforcement
Program". Mentors spoke with teachers, met with mentee for 20
minutes and followed up on achievement and goals.


Johnson (1997, 1999) 

Mentors meet with mentees at least once monthly, with phone calls in 
between meeting times. Provide assistance in college and financial aid 
application process, attend SAS outings, monitor student's grades, and 
report on relationship's progress with SAS program staff. 

None

Karcher (2008)

8 meetings with school-based mentor. Part of larger multi-component
school based intervention. Mentors were 54% Latino, 3% Caucasian, 5%
AA and 6% other. 70% college students. 43% spoke Spanish. 73% 
female. 

Both experimental and compari! 
services through community ag( 
enhancement activities, guidanc 
tutoring. 

Keating (1996) 
Keating, Tomishima, 
Foster & Alessandri
(2002) 

Youth and adults spend a minimum of 3 hrs. in activities such as going 
to sporting event, the movies, or a park. 

Life skills training - a monthly se 
professionals on topics such as: 
and alcohol abuse, cross culturj 
and school problems. 

Kelley (1973) 

Ultimate goal for each student counselor was to establish with his
juvenile companion a "corrective counseling relationship." 1:1 mentors, 3-
8 months (ave. 5.6 months), 4 times/month, less than 3 hrs. each 
meeting. 

None 

Kelley, Kiyak, & Blak
(1979) 

Meetings weekly for a minimum of 4 hrs. 

None 

Kemple & Scott-Clayton 
(2004) 

Interpersonal support. 

Implemented Career Academie: 
organization and that also provii 

Lattimore, Mihalic, 
Grotpeter, & Taggart 
(1998) 

"Coordinator", or mentor, coordinates the program for youth partner. 
250 hrs. educational activities (computer-assisted instruction, peer 
tutoring): 250 hrs. development activities (cultural activities, acquiring 
life/family skills, college and/or occupational training): 250 hrs. service 
activities (community service projects, helping with public events, work 
as volunteer for various agencies). 

Education activities (e.g., peert 
instruction), development activit 
job preparation), service activitit 
volunteering). Financial Incentii 

LoSciuto, L., A.K. 
Rajala, T.N. Townsend, 
and A.S. Taylor. (1996)

Spent a minimum of 4 hours together each week, engaging in a variety
of activities (e.g., helping with homework, attending class field trips,
attending cultural/sporting events). 

None. Treatment condition cons
community service, classroom-t 
parent workshops. Control groi 
classroom-based life skills curri( 
interventions without mentoring. 

Maxfield, Schirm, &
Rodriguez-Planas
(2003)

No description of specific mentoring activities other than mentoring 
being a component of the case management. Noted that case 
managers developed "deep personal relationships" with 40 - 60 percent 
of students at some sites. 

Case management, target of 25 
components - education, develo 
community service. Financial im 

Moore & Levine (1977) 
Moore (1987) 

Weekly meetings between "citizen counselors" and subjects. 

None - probation programs fora 
counseling only to treatment grc 

Newton (1994) 

Each mentor met weekly with 1-2 students during 1 semester. Provided 
academic assistance, worked with teachers to establish behavioral 
goals, and served as positive role models. 

None 

Polit, Kahn & Stevens 
(1985) 

Quint (1991) 

Served as confidantes, escorted to appointments, recreational events, 
made reminder calls, and acted as paraprofessional case managers. 

Informational workshops, links t 
counseling.

Reyes & Jason (1991)

Guidance and counseling by homeroom teachers. 

Redesign of school day to keep 
together (3 core classes). Feed 
weeks. 

Rowland (1992) 

Mentors met with high-risk students for min. of 1 hr./wk. for school year. 

None 

Royse (1998) 

No details on content of mentoring. Also included monthly group
outings. 

None 

Schinke, Cole, & Poulin 
(2000) 

Discussions with adults. 

Weekly structured activities of tl 
program. 

Watson (1996) 

Four distinct mentoring treatments: (1) Mentor called student 2x/wk. (2) 
Student instructed to call mentor 2x/wk. (3) Mentor met with group of 5
students 2x/wk. (4) Mentor met with student 2x/wk. 

none 

McCord (1978, 1979) 

5 year treatment: counselors assigned to each family visited ave. twice 
a month. 

For treatment group, 1/3 focuse
tutored in academic subjects, 1/
psychiatric attention, 1/4 sent to
into Boy Scouts, YMCA, or simil

Table 4

Standardized Mean Difference Effect Sizes and Homogeneity Statistics from Random Effects Mentoring Meta-Analyses

Model                                   SMD    95% CI           Z          τ²     I²      H
Delinquency (k = 25 studies)            0.21   0.17 to 0.25     9.84**     0.01   99.3%   11.72
Aggression (k = 7 studies)              0.29   -0.04 to 0.62    1.71       0.18   95.4%   4.66
Drug Use (k = 6 studies)                0.16   -0.00 to 0.32    1.93       0.04   99.9%   27.01
Academic Achievement (k = 25 studies)   0.11   0.07 to 0.15     5.86**     0.01   60.0%   1.58
Overall Effects (k = 46 studies)        0.18   0.15 to 0.21     10.80**    0.01   99.2%   11.50

Note: *p < .05, **p < .01
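The I² and H columns of Table 4 describe the same heterogeneity in two ways. As a minimal sketch, assuming the usual Higgins and Thompson definitions (H² = Q/(k − 1) and I² = (H² − 1)/H², which this section does not state explicitly), H can be recovered from I² as sqrt(1/(1 − I²)). The function names below are illustrative only, and small mismatches with the printed H values are expected where I² has been rounded to one decimal place.

```python
import math


def h_from_i2(i2_percent: float) -> float:
    """Convert I^2 (in percent) to H, assuming I^2 = (H^2 - 1) / H^2."""
    i2 = i2_percent / 100.0
    if not 0.0 <= i2 < 1.0:
        raise ValueError("I^2 must lie in [0, 100) percent")
    return math.sqrt(1.0 / (1.0 - i2))


def i2_from_q(q: float, k: int) -> float:
    """I^2 (in percent) from Cochran's Q and k studies, truncated at zero."""
    df = k - 1
    if q <= 0 or df <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# I^2 values copied from two rows of Table 4.
table4_i2 = {"Aggression": 95.4, "Academic Achievement": 60.0}
for outcome, i2 in table4_i2.items():
    print(f"{outcome}: I^2 = {i2:.1f}% -> H = {h_from_i2(i2):.2f}")
```

For the Aggression and Academic Achievement rows this conversion gives 4.66 and 1.58, which agrees with the H column of the table.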
Table 5

Differences in Mean Effect Sizes by Study Design

                        Quasi-experimental     Randomized Controlled
                        Designs                Trials                   Meta-Regression
Outcome                 #Studies   SMD         #Studies   SMD           B       SE
Delinquency             11         0.20        14         0.42          .19     .16
Aggression              3          0.14        4          0.41          .26     .29
Drug Use                3          0.19        3          0.13          -.07    .11
Academic Achievement    15         0.14        10         0.22          .19     .16


Table 6

Moderation of Mentoring Effects (Random Effects Models)

                              Level of Moderator
                              Absent                       Present                      Meta-Regression
Moderator                     k    SMD    L      U         k    SMD    L      U         B       SE
Mentee Selection
  Individual Risk             22   0.20   0.05   0.35      16   0.23   0.11   0.35      0.03    0.09
  Environmental Risk          28   0.20   0.09   0.31      8    0.23   -0.06  0.51      0.03    0.12
Other Interventions           23   0.20   0.06   0.34      23   0.31   0.13   0.49      0.07    0.10
Motivations of Mentors
  Civic Duty                  11   0.24   0.00   0.47      21   0.22   0.09   0.35      0.02    0.11
  Professional Development    20   0.16   0.05   0.27      16   0.42   0.16   0.68      0.2P    0.11
Quality and Fidelity Checks
  Quality Check               14   0.20   0.06   0.35      20   0.21   0.05   0.38      -0.00   0.10
  Fidelity Check              27   0.20   0.09   0.30      6    0.29   -0.15  0.73      0.05    0.14
Key Processes
  Modeling/Identification     28   0.24   0.08   0.40      11   0.32   0.08   0.56      0.06    0.12
  Emotional Support           12   0.11   0.00   0.23      27   0.32   0.14   0.50      0.22*   0.12
  Teaching                    11   0.12   -0.01  0.24      30   0.29   0.15   0.44      0.15    0.10
  Advocacy                    32   0.13   -0.05  0.31      10   0.39   0.06   0.72      0.17*   0.09

Notes: * p < .05, one-tailed

Random effects models of standardized mean differences (SMD) are the sources of the significance tests for the SMDs within
levels of each moderator. The meta-regression models are mixed effects models using full maximum likelihood estimation.
k = number of studies, SMD = standardized mean difference, L = lower limit of the 95% confidence interval for the SMD, U =
upper limit of the 95% confidence interval for the SMD
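The asterisks in Table 6 mark meta-regression coefficients that are significant at p < .05, one-tailed. As a rough sketch of how such a flag can be checked from the tabled B and SE values, the Python below computes an upper-tail p-value for z = B/SE. The standard normal reference distribution and the function name are assumptions made for illustration; the review's software may have used a different test, and the tabled B and SE values are rounded.

```python
from statistics import NormalDist


def one_tailed_p(b: float, se: float) -> float:
    """Upper-tail p-value for z = b / se under a standard normal reference."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = b / se
    return 1.0 - NormalDist().cdf(z)


# (B, SE) pairs copied from the Key Processes rows of Table 6.
rows = {
    "Emotional Support": (0.22, 0.12),
    "Teaching": (0.15, 0.10),
    "Advocacy": (0.17, 0.09),
}
for name, (b, se) in rows.items():
    p = one_tailed_p(b, se)
    flag = "*" if p < 0.05 else ""
    print(f"{name}: z = {b / se:.2f}, one-tailed p = {p:.3f}{flag}")
```

Under these assumptions, Emotional Support and Advocacy fall below .05 and Teaching does not, which matches the pattern of asterisks in the table.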
Table 7: Citations for Excluded Studies


Excluded Studies (N = 118) 

Abcug(1991) 

Ahrens, Richardson, Lorazno, DuBois (2007)

Baldwyn Separate School District, MS. (1982). 

Banta & Lawson (1980) 

Barron-McKeagney, Woody, & D'Souza (2001) 

Beier, Rosenfeld, Spitalny, Zansky, & Bontempo (2000) 

Bellamy, Springer, Sale, & Espiritu, (2004) 

Bernstein, Rappaport, Olsho, Hunt & Levin (2009) 

Bilbrew (2009) 

Blakely, Menon, & Jones (1995)

Blinn-Pike, Kuschel, McDaniel, Mingus, & Mutti (1998). 

Bracy (2008) 

Bruce & Mueller (1994) 

Campos, Phinney, Perez-Brena, Kim, Ornelas, Nemanim, Padilla, Mihecoby & 
Ramirez (2009) 

Carrington, Tymms, & Merrell (2008) 

Cave & Quint (1990) 

Cheng, Haynie, Brenner, Wright, Cung & Simons-Morton (2008) 

Ching, Yeh, Siu, Wu & Okubo (2009) 

Clarke (2009) 

Colley (2003) 

Colley (2003) 

Colson, Godsey, Mayfield, Nash, & Borman (1978) 

Conduct problems prevention research group (2007) 

Conduct problems prevention research group (2010) 

Cummings (2010) 

Dance (2001). 

Dappen & Isernhagen (2002).

Davis & Haney (2003)

Davison (1994) 

De Blank (2009) 

De Wit, Lipma, Manzano-Munguia, Bisanz, Graham, Offord, O'Neill, Pepler &
Shaver (2007)

DuBois & Silverthorn (2005).

Elledge, Cavell, Ogle & Newgent (2010) 

Frazier, Richards, & Potter (1981) - 2 studies

Galvin (1989)

Garate-Serafini, Balcazar, Keys, & Weitlauf (2001) 

Gearing (2008) 

George (1986) 

Goldner & Mayseless (2009)

Gordon, Iwamoto, Ward, Potts & Boyd (2009)

Goodman (1972)

Graber (1985)

Grant (2010) 

Green (1979) 

Green (2010) 

Guetzloe (1997) 

Hanlon, Simon, O'Grady, Carswell & Callaman (2009) 

Hart, O'Toole, Price-Sharps & Shaffer (2007) 

Hayward & Tallmadge (1995)

Heard (1990) 

Hernandez (2009) 

Herrera, Sipe, & McClanahan (2000) 

Herrera, Grossman, Kauh & McMaken (2011) 

Hill (1972) 

Hines (1988) 

Howitt, Moore, & Gaulier (1998)

Huisman (1992)

Jent & Niec (2006)

Johnson (2009)

Johnson, Holt & Powell (2008)

Joseph (1992)

Karcher (2008)

Keenan (1992) 

King, Vidourek, Davis, & McClellan (2002)

Klaw, Rhodes, & Fitzgerald (2003)

Komro, Flay, & Biglan (2011)

Lakes (1997)

Lamb (2010)

Lampley & Johnson (2006)

Laughrey (1990)

Lee, Pilonis, & Luppino (1989)

Martin (2008) 

McGreevy (2007) 

McPartland & Nettles (1991) 

Mecartney (1994) 

Mertens (1988) 

Mitchell & Casto (1988)

Morley, Rossman, Kopczynski, Buck, & Gouvis (2000)

Nelson & Valliant (1993)

New York City Board of Education (1986) 

Pace (2010) 

Pagan & Edwards-Wilson (2003) 

Pedersen, Woolum, Gagne & Coleman (2009) 

Posted (2008) 

Powers & McConner (1997) 

Powers, Sowers, & Stevens (1995) 

Reglin (1997) 

Reller (1987) 

Rhoden-Trader (1998) 

Rhodes, Haight, & Briggs (1999) 

Rippner (1992) 

Roberts & Cotton (1994) 

Rockwell (1997) 

Rollin, Kaiser-Ulrey, Potts, & Creason (2003) - 2 studies 

Rosenblum, Magura, Fong, Curry, Norwood & Casella (2006) 

Roussos (2002) 

Rulison (2010) 

Sale, Bellamy, Springer, & Wang (2008) 

Schmidt, McVaugh & Jacobi (2007) 

Schwartz, Rhodes, Chan & Herrera (2011) 

Schobitz (2004) 

Seidle (1982) 

Slicker & Palmer (1993) 

Slough, McMahon & Conduct Problems Prevention Group (2008) 

Smith (1990) 

Smith, Leve & Chamberlain (2011) 

Stanwyck & Anson (1989) 

Sterba (2001) 

Struchen & Porta (1997) 

Tebes, Feinn, Vanderploeg, Chinman, Shepard, Brabham, Genovese, & Connell 
(2007) 

Tierney, Grossman, & Resch (1995) 

Turner & Scherman (1996) 

Valenzuela-Smith (1984) 

Welkowitz & Fox (2000) 

Wunrow & Einspruch (2001) 

Wyman, Cross, Brown, Yu, Tu & Eberly (2010) 

Wyatt (2009) 

Zand, Thomson, Cervantes, Espiritu, Klagholz, LaBlanc & Taylor (2009) 

9 Figure Captions 


Figures 1-4 

Forest plots of meta-analysis of the effects of mentoring interventions for each 
outcome. 

Figure 1 reports studies measuring outcomes related to delinquent involvement. 

Figure 2 reports effects related to academic achievement. 

Figure 3 reports effects on aggression or externalizing behaviors. 

Figure 4 reports effects on illegal drug use. The size of the center square shows the 
weight assigned to the study and the width of the error bars shows the 95% 
confidence interval for the effect size of each study. 


Figures 5-6 

Plots of average overall standardized mean difference (SMD) effect sizes and 95% 
confidence intervals by levels of moderating variables. 

Figure 5 graphs moderation of overall effects by two possible motivations of 
mentors, civic duty and professional development. 

Figure 6 graphs the overall effect estimates by the presence or absence of key 
processes in the mentoring intervention, including emotional support, promotion of 
modeling or identification with the mentor, and teaching. 

Figure 1 

Figure 2 

Aggression 

Studies: Holt et al. (2008); Abbott et al. (1997); Cavell & Hughes (2000); 
Dicken et al. (1977) - A; Dicken et al. (1977) - B; Keating et al. (2002); 
Newton (1994) 

Random effects estimate: SMD = .29, Q = 130.35, I² = 95.4%, τ² = .18 

[Forest plot; horizontal axis: standardised mean difference, ticks from -3 to 1] 

Figure 3 

Drug Use 

Figure 4 

Academic Achievement 

Figure 5 

Figure 6 

[Legend: Absent, Present] 
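
The heterogeneity statistics printed with the aggression forest plot (Q = 130.35 and a heterogeneity percentage of 95.4%) can be cross-checked against each other. The following is a minimal sketch, not the authors' procedure. It assumes the conventional definition I² = max(0, (Q - df) / Q) with df = k - 1, and it assumes that the seven rows listed for that plot are the k = 7 effect sizes entering Q. The function name `i_squared` is illustrative.

```python
def i_squared(q: float, k: int) -> float:
    """Return I^2 as a proportion, given Cochran's Q and k effect sizes.

    Assumes the conventional definition I^2 = max(0, (Q - df) / Q), df = k - 1.
    """
    if k < 2:
        raise ValueError("need at least two effect sizes")
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)


if __name__ == "__main__":
    # Q = 130.35 with seven effect sizes, as listed for the aggression plot.
    print(f"I^2 = {i_squared(130.35, 7):.1%}")
```

Under those assumptions the result rounds to 95.4%, which agrees with the percentage printed with the plot.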


10 Appendices 


Appendix 1: Lipsey and Wilson (1998) Codebook 
Appendix 2: DuBois et al. (2002) Code Sheet 
Appendix 3: Tolan et al. (2004) additional coding 

Appendix 1: Juvenile Delinquency 
Meta-Analysis Coding Manual 


REVISED ELIGIBILITY CRITERIA FOR INCLUSION OF A STUDY 
IN THE DELINQUENCY META-ANALYSIS 

1. The study must investigate the effects of an intervention or treatment, broadly 
defined. In addition to therapeutic type treatments, eligible interventions can 
include such modalities as incarceration, probation, systems intervention, and the 
like. Note that the intervention need not explicitly aim to reduce or prevent 
delinquency. For example, a program to teach delinquents to read would qualify if it 
met all other criteria even though it was presented as an academic improvement 
program rather than a delinquency reduction program. The following interventions, 
however, are specifically excluded: (a) treatments targeted exclusively on substance 
abuse without attention to any other components of antisocial behavior or outcome 
variables representing delinquency other than substance use violations; (b) 
pharmaceutical or medical treatments without significant psychosocial components, 
e.g., drugs, diet, cosmetic surgery, and the like. 

2. The intervention must be applied to a sample that includes juvenile offenders. An 
offender is defined as a person apprehended by the police, involved with the juvenile 
or criminal justice system, or identified as having engaged in behavior chargeable 
under applicable laws, whether or not apprehended or charged. Chargeable offenses 
include "status" offenses (runaway, truancy, curfew violations, incorrigible, out of 
parental control) and actions in school and other such contexts that are 
interpretable as chargeable offenses even if not presented as delinquent behavior, 
e.g., fighting (assault), damaging school property (vandalism), and the like. A 
juvenile is defined as anyone under the age of 21 (i.e., age 20.9 or under). If both 
juveniles and adults are included in the treatment sample, the study is acceptable if 
the study reports the juvenile results separately or juveniles constitute a majority of 
the subjects for whom results are reported. Note that if there are any clearly 
identified juvenile offenders under these definitions in the treatment sample (even 
one), this eligibility criterion is met. 

3. The study must measure at least one quantitative delinquency outcome variable. 
In addition, it must report results on at least one such variable in a form that, at 
minimum, allows the direction of the effect to be determined (whether the outcome 
was more favorable for the treatment or control group). If a delinquency outcome is 
measured but the reported results fall short of this standard, the study will still be 
acceptable if the required results can be obtained from the author or other sources. 

A delinquency outcome variable is one that represents, at least in part, the subject's 
involvement in behavior that constitutes chargeable offenses as defined in 2 above. 

4. The study design must involve a comparison that contrasts one or more 
identifiable focal treatments with one or more control conditions. Control conditions 
can be "no treatment," "treatment as usual," "placebo treatment," and so forth as 
long as they do not represent a concerted effort to produce change. Thus, 
treatment-treatment comparisons are not eligible unless one of the "treatments" is explicitly 
presented as a form of control condition, e.g., a "straw man" treatment not expected 
to be effective. When different naturally occurring facilities or groups (e.g., court or 
probation dispositions) are compared, the study will be eligible only if the different 
groups are presented as a contrast between a program or intervention of special 
interest and a control (e.g., "treatment as usual"). For example, a comparison of the 
pre and post arrest rates for juveniles in each of several probation camps would not 
be eligible unless it was explicitly presented as a contrast between camps with 
distinctive programming, e.g., "milieu therapy," and others that followed relatively 
indistinctive routine and customary practices. 

Random assignment designs that meet the above conditions are always eligible 
under this criterion. One-group pretest-posttest studies are never eligible (studies in 
which the effects of treatment are examined by comparing measures taken before 
treatment with measures taken after treatment on a single subject sample). 
Non-equivalent comparison group designs may be eligible (studies in which treatment 
and control groups are compared even though the research subjects were not 
randomly assigned to those groups). To be eligible, however, such comparisons must 
have either (a) matching of the treatment and control groups prior to treatment on 
at least one recognized risk variable for delinquency such as prior delinquency 
history, sex, age, ethnicity, or socioeconomic status; (b) a pre-intervention measure 
(pretest) for at least one delinquency outcome variable on which the treatment and 
control groups can be compared; or (c) a pre-intervention measure on at least one 
recognized risk variable for delinquency (as above) on which the treatment and 
control groups can be compared. Note that the pre-intervention measures need not 
show that the treatment and control groups are actually similar, only be capable of 
showing their degree of similarity (or dissimilarity). 

5. The study must be set in the U.S. or a predominately English-speaking country 
and use juveniles resident to that country. Note that the juveniles need not be 
English-speaking or "Anglo." A study conducted in the U.S. or Canada with resident 
Hispanic juveniles, for example, would qualify. In addition, the study must be 
reported in English; studies reported in another language will be excluded 
irrespective of where they were conducted or the nationality of the juveniles. 

6. The date of publication or reporting of the study must be 1950 or later even 
though the research itself might have been conducted prior to 1950. If, however, 
there is evidence in the report that the intervention under study was applied to the 
research sample prior to 1945 (i.e., more than five years before the 1950 cutoff date), 
then the study should be excluded. 
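
The six criteria above amount to a screening rule that can be applied study by study. The sketch below is one hypothetical way to record that screening. The `StudyScreen` class, its field names, and the `failed_criteria` function are illustrative assumptions and are not part of the coding manual. Each boolean stands for a coder's judgment on the corresponding criterion, and only criterion 6 is computed from dates.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StudyScreen:
    """Hypothetical screening record; field names are illustrative only."""
    evaluates_intervention: bool       # criterion 1
    includes_juvenile_offenders: bool  # criterion 2 (juvenile = under age 21)
    has_delinquency_outcome: bool      # criterion 3
    has_eligible_comparison: bool      # criterion 4
    english_speaking_setting: bool     # criterion 5
    reported_in_english: bool          # criterion 5
    publication_year: int              # criterion 6
    # Year the intervention was applied; None when the report gives no evidence.
    intervention_year: Optional[int] = None


def failed_criteria(s: StudyScreen) -> List[int]:
    """Return the numbers of the criteria the study fails (empty = eligible)."""
    failed = []
    if not s.evaluates_intervention:
        failed.append(1)
    if not s.includes_juvenile_offenders:
        failed.append(2)
    if not s.has_delinquency_outcome:
        failed.append(3)
    if not s.has_eligible_comparison:
        failed.append(4)
    if not (s.english_speaking_setting and s.reported_in_english):
        failed.append(5)
    # Criterion 6: reported 1950 or later, and no evidence that the
    # intervention was applied before 1945.
    too_old = s.publication_year < 1950
    applied_early = s.intervention_year is not None and s.intervention_year < 1945
    if too_old or applied_early:
        failed.append(6)
    return failed
```

For example, a study published in 1952 whose report shows the intervention was delivered in 1943 would fail criterion 6 only, while the same study with no evidence about the intervention date would pass.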


ELIGIBILITY CHECKLIST 


No Yes 


Involves a "treatment," broadly defined, that can be viewed as 
potentially having some practical benefit for juvenile or society; not 
restricted to a treatment of solely theoretical interest. 

Involves a comparison that contrasts one or more identifiable focal 
treatments with one or more control conditions. 

Subjects assigned randomly, matched, or pre-treatment group 
equivalence available? 

Quantitative outcome data or direction of effect available on at least 
one delinquency outcome measure. 

Involves juvenile delinquents or subjects committing acts which 
constitute chargeable offenses. 

Subjects are under the age of 21. 

Study is set in an English-speaking country and reported in English. 

Date of publication is 1950 or later. 

STUDY HEADER AND EXPERIMENTAL COMPARISONS 


Definition of a study 

The "unit" to be coded consists of a study, i.e., one research investigation of defined 
subject samples compared to each other and the treatments, measures, and 
statistical analyses applied to them. Sometimes there are several different reports of 
a single study. In such cases, the coding should be done from the set of relevant 
reports, using whichever is best for each item to be coded; be sure you have the full 
set of relevant reports before beginning to code. Sometimes a single report describes 
more than one study, e.g., a series of similar studies done at different sites. In these 
cases, each study should be coded separately as if each had been described in a 
separate report. 


Study and Coder Identification 

[Note: Variable names for SPSS in brackets, e.g., [ID]; these are not shown in 
FileMaker and can be ignored for coding purposes.] 

Identification number of primary report as assigned in the master bibliography [ID]. 

/ / Date coded [CodeDate] 

Coder's initials (3 letters) [Coder] 

CONTEXT SCREEN 


Type of publication [SH2] (if multiple, code highest in list; e.g., if dissertation 
and journal article, code study as journal article). 

1 book 

2 journal article/book chapter 

3 thesis/ dissertation 

4 technical report 

5 conference paper 

6 other: 

Year of publication [SH3] (two digits; estimate if necessary). If you have multiple 
reports, enter the year that corresponds to the report you selected under 'type of 
publication' above. If there are multiple reports of the same type, use the earliest 
date. [Eligibility issue: not before 1950] 
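
The two rules above (code the publication type highest in the list, then take the earliest year among reports of that type) can be sketched as follows. This is an illustrative sketch only. The function name `select_report` and the tuple layout are assumptions, and four-digit years are used here for readability even though the form asks for two digits.

```python
from typing import List, Tuple


def select_report(reports: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Pick the (type_code, year) pair to code for one study.

    `reports` holds one (type_code, year) pair per report of the study,
    where type_code follows the "Type of publication" list (1 = book,
    2 = journal article/book chapter, 3 = thesis/dissertation, ...).
    A lower code is higher in the list and therefore takes precedence.
    """
    if not reports:
        raise ValueError("a study needs at least one report")
    best_type = min(type_code for type_code, _ in reports)
    # Among reports of the selected type, use the earliest date.
    year = min(y for type_code, y in reports if type_code == best_type)
    return best_type, year
```

A study with a 1994 dissertation and journal articles from 1996 and 1998 would therefore be coded as a journal article with year 1996.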

Senior author's discipline [SH5] (check best one): Note that this question asks 
about the senior author - thus, if more than one author, use discipline of first 
author. 

01 psychology 

02 sociology 

03 education 

04 criminal justice; criminology 

05 social work 

06 psychiatry; medicine 

07 political science 

08 anthropology 

09 other: 

10 cannot tell 

Country in which study conducted [SH6] 

[Eligibility issue- should be English speaking culture] 

1 USA 

2 Canada 

3 Britain 

4 other Commonwealth/ English speaking 

5 other 

6 cannot tell 

Role of evaluator/author in the program [SH9] (if more than one, check the 
highest on the list): [Note: This item is focusing on the role of the research team 
working on the evaluation regardless of whether they are all listed as authors.] 

1 Evaluator delivered therapy/ treatment 

2 Evaluator involved in planning, controlling, or supervising delivery treatment 
or Evaluator is designer of program 

3 Evaluator influential in service setting but no direct role in delivering, 
controlling, or supervision 

4 Evaluator independent of service setting and treatment; research role only 

5 cannot tell 

Program age at time of research [SH10] (check best judgment): [Note: If 
several treatments of different sorts, answer in terms of the treatment to be used in 
the aggregate experimental comparison, next section. If organization predates 
treatment, respond in terms of how new treatment is if can assess; if not, indicate 
how new organization is if can assess. This item is attempting to distinguish between 
inexperienced, formative, immature programs and those that have been refined and 
are more mature.] 

1 relatively new, e.g., less than two years old or first of relatively few client 
cohorts 

2 established program, in place two years or more, or many client cohorts 

3 defunct program, evaluated post hoc 

4 cannot tell 

Program sponsorship [SH11] (check best one): [Note: Who administers and 
"owns" the program irrespective of where housed. This is a question of who makes 
decisions like staffing, changing the program, etc. The first two categories are 
basically for research and demonstration programs organized by researchers 
primarily for research purposes. Usually the last three categories are the appropriate 
choices if the work is done in a service agency even if for research purposes.] 

1 demonstration program/ treatment administered by researchers for one 
treatment cohort only 

2 demonstration program/ treatment run by researchers for multiple treatment 
cohorts 

3 independent "private" program with own facility, staff, etc. (e.g., YMCA, private 
agency, university clinic) 

4 public program, non criminal justice sponsorship (e.g., school sponsored, 
community mental health, department of social services) 

5 public program, criminal justice sponsorship (e.g., police, probation, courts) 

6 cannot tell 


GROUPS SCREEN 






Experimental Comparisons Worksheet 

Step 1: Identify all group comparisons in the study. A comparison consists of a 
configuration in which group differences are or could be tested with t-tests, F-tests, 
chi-squares, etc. applied to various dependent measures. Your concern now is with 
the group comparisons, not the number or nature of dependent measures on which 
they may be compared (that comes later). For example, one treatment group 
compared with one control group on six dependent measures is one experimental 
comparison. The full range of interesting variation on experimental comparisons 
expected in studies includes the following three possibilities: 

(a) Aggregate treatment and control groups. The largest subject groupings on which 
contrasts between experimental conditions can be made. Often there is only one 
aggregate treatment group and one aggregate control group, but it is possible to have 
a design with numerous treatment variations (e.g., different levels) and control 
variations (e.g., placebos) all compared (e.g., in ANOVA format). These are the 
groups you will identify on the GROUPS screen. 

Step 2: Write in the name/description of each aggregate treatment group and each 
aggregate control group in the appropriate boxes and, underneath, the number 
(count) of such groups. 

[SH24]: Total number of treatment groups from this study. 

[SH25]: Total number of control groups from this study. 

Step 3: You will code only one aggregate treatment vs. control comparison plus 
selected breakouts and post- treatment follow-ups. If there is more than one 
aggregate treatment group and/ or more than one aggregate control group, a 
selection of which pairing to code must be made as follows: 

(a) More than one aggregate treatment group. First, determine if the various 
treatments are sufficiently similar to combine. This requires that treatment be 
virtually the same, at least by generic label, for each group, e.g., groups with the 
same treatment but implemented at different sites or stratified into subgroups that 
can be recombined into a sensible whole. In such cases, combine the treatment 
groups into a composite whole if appropriate statistics are available (note: an Excel 
calculator called "group combo" is available to do the required computations for this 
in some cases). If statistics for combination are unavailable, select one treatment 
group to code, as indicated below, and drop the others. Note that if each treatment 
group has its own distinct control group, separate studies are constituted requiring 
that each treatment- control pair be coded as independent studies. 
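
The "group combo" calculator itself is not reproduced in this manual. As an illustration of the kind of computation involved, the sketch below pools the sample size, mean, and standard deviation of two subgroups using the standard formula for combining groups. Treating this as what the calculator does is an assumption, and the function name `combine_groups` is hypothetical.

```python
import math
from typing import Tuple


def combine_groups(n1: int, m1: float, sd1: float,
                   n2: int, m2: float, sd2: float) -> Tuple[int, float, float]:
    """Pool two subgroups into one composite group.

    Returns (n, mean, sd) for the combined sample. The SD accounts for
    both the within-group variation and the difference between the means.
    """
    n = n1 + n2
    if n < 2:
        raise ValueError("combined sample needs at least two subjects")
    mean = (n1 * m1 + n2 * m2) / n
    sum_sq = ((n1 - 1) * sd1 ** 2
              + (n2 - 1) * sd2 ** 2
              + (n1 * n2 / n) * (m1 - m2) ** 2)
    return n, mean, math.sqrt(sum_sq / (n - 1))
```

More than two sites or strata can be combined by applying the function repeatedly, feeding each composite back in as the first group.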

If the treatments are distinct, e.g., deliberate experimental variations, and cannot be 
combined into a relatively uniform composite, then one must be selected as follows: 

• If one treatment is clearly the focal concern of the study, with others serving as 
examples of more conventional approaches, etc., then select the focal treatment. 

• If the treatments are parametric variations, e.g., counseling with and without 
advocacy, then select the most complete or extensive treatment, e.g., the 
counseling with advocacy. Extensive refers to breadth of services not number of 
hours of service. This is a subset/superset issue. If one treatment is a subset of 
another, in the sense of having some but not all of the treatment elements of the 
other, take the superset as the treatment group of interest. 

• If the treatments are different, of equal interest to the study, and of equal 
completeness, then select the one with the largest N. If equal N, select the one 
that is least unusual and, if equal in that regard, make a random choice (coin 
toss). 
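
The selection order in the three bullets above can be sketched in code. This is an illustrative sketch under stated assumptions: the `TreatmentGroup` class, its fields, and `select_treatment` are hypothetical names, "focal" and the set of treatment elements are coder judgments supplied as inputs, and the "least unusual" tie-break is not modeled because it requires coder judgment, so ties on N fall through directly to a random choice.

```python
import random
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class TreatmentGroup:
    """Hypothetical record for one aggregate treatment group."""
    name: str
    n: int
    is_focal: bool = False
    elements: FrozenSet[str] = frozenset()  # treatment components, per coder


def select_treatment(groups: List[TreatmentGroup],
                     rng: Optional[random.Random] = None) -> TreatmentGroup:
    """Apply the focal, superset, largest-N, then coin-toss rules in order."""
    if not groups:
        raise ValueError("no treatment groups to choose from")
    # Rule 1: a single clearly focal treatment wins.
    focal = [g for g in groups if g.is_focal]
    if len(focal) == 1:
        return focal[0]
    # Rule 2: a treatment whose elements strictly contain every other's.
    for g in groups:
        if all(g is other or g.elements > other.elements for other in groups):
            return g
    # Rule 3: largest N; remaining ties are settled by a random choice.
    max_n = max(g.n for g in groups)
    largest = [g for g in groups if g.n == max_n]
    if len(largest) == 1:
        return largest[0]
    return (rng or random.Random()).choice(largest)
```

Passing a seeded `random.Random` makes the coin toss reproducible, which may be useful when a second coder needs to confirm the choice.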

(b) More than one aggregate control group, e.g., attention placebo, no control, etc. 
Select the best control group available to code from the rank order listing below 
(best listed first): 

1) "no treatment" control (control gets no treatment, left alone) 

2) placebo control (controls get some attention or sham treatment) 

3) treatment as usual control (controls get "usual" handling instead of special 
treatment, e.g., regular probation or school) 

4) "straw man" alternate treatment control not expected to be effective but used as 
contrast for treatment group of primary interest 

If there are multiple groups in any of these categories, combine them if possible and 
sensible; otherwise, choose the one aimed at the group most similar to the group 
receiving the treatment of interest. If you still can't choose on this basis, randomly 
select one group as the control. 
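
The rank ordering of control conditions above lends itself to a small lookup. The sketch below is illustrative only: the category labels, `CONTROL_RANK`, and `best_control_category` are hypothetical names. It returns the best-ranked category present along with every group in that category, leaving the decision to combine groups or choose among them to the coder.

```python
from typing import Dict, List, Tuple

# Rank order from the list above: a lower number is the preferred control.
CONTROL_RANK: Dict[str, int] = {
    "no_treatment": 1,
    "placebo": 2,
    "treatment_as_usual": 3,
    "straw_man": 4,
}


def best_control_category(controls: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return the best-ranked control category and the groups in it.

    `controls` maps each control group's name to its category label.
    Groups with labels outside CONTROL_RANK (e.g., a genuine alternate
    treatment) are ignored because they are not eligible controls.
    """
    eligible = {name: cat for name, cat in controls.items() if cat in CONTROL_RANK}
    if not eligible:
        raise ValueError("no eligible control group")
    best = min(eligible.values(), key=CONTROL_RANK.__getitem__)
    return best, sorted(name for name, cat in eligible.items() if cat == best)
```

When the returned list holds more than one group, the manual's next step applies: combine them if possible and sensible, otherwise pick the group most similar to the treatment group.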

If there are no control groups in these categories, i.e., an uncontrolled study or one 
comparing alternate treatments to each other but not to a control, the study is 
ineligible for coding. Be careful, however, not to confuse "treatment as usual" 
controls, which are eligible, with "treatment-treatment" comparisons, which are 
not eligible. If a treatment is deliberately designed as an "add on" to the conditions 
the juveniles otherwise experience, then it cannot be considered a control. 

Treatment as usual is the normal or usual condition of the juveniles at issue. For 
example, in a study of treatment of probationers, the "usual" treatment is normal 
probation. Comparison of juveniles on normal probation with those receiving special 
intensive supervision, extra counseling, or the like would be an eligible study. Also, 
do not confuse a placebo treatment, which is eligible, with an "alternate treatment" 
comparison. A placebo treatment is deliberately set up for the purpose of making a 
particular contrast with treatment, i.e., it has certain characteristics of treatment but 
lacks the presumed critical ingredient. Alternate treatments, by contrast, are 
legitimate treatments in their own right, not defined in terms of their role as a 
contrast for the focal treatment of interest. Sometimes an alternate treatment is used 
for comparison with no expectation that it will be effective, i.e., it is a "straw man" 
treatment perceived ineffective and included for contrast with an identifiable focal 
treatment of primary interest. In such cases, the alternate treatment control would 
be eligible; it is virtually a placebo condition. 

Reminder: If there are multiple treatments, each paired with its own control 
group(s), these are coded as separate studies. The above applies only to cases where 
multiple treatments and/or multiple controls are compared altogether in a single 
multi-group study. 

Step 4: Finally, write the names/ descriptions of the aggregate treatment and 
aggregate control group chosen in the designated places at the bottom of the 
GROUPS screen. Note: At this point, the one aggregate experimental comparison to 
be coded has been identified (i.e., one aggregate treatment group compared with one 
aggregate control group). Only that one aggregate comparison should be considered 
in completing the remainder of the coding. 

GROUP EQUIVALENCE SCREEN 

The unit on which assignment to groups was based [SH26] (check best one): 

1 individual juvenile, i.e., some juveniles assigned to treatment, some to 
comparison group (this is the most common case) 

2 classroom, facility, etc., i.e., whole classrooms, etc. assigned to treatment, 
comparison groups 

3 program area, regions, etc., i.e., region assigned as an intact unit 

4 cannot tell 

How subjects assigned to treatment and control groups [SH27] (check best 
one): 

Random or quasi-random: 

1 randomly after matching, yoking, stratification, blocking, etc. (This means 
matched or blocked first then randomly assigned within each pair or block. 

This does not refer to blocking after treatment for the data analysis.) 

2 randomly without matching, etc. (includes also cases such as when every other 
person goes to the control group) 

3 regression discontinuity; quantitative cutting point defines groups on some 
continuum (this is rare) 

4 wait list control or other such quasi-random procedures presumed to produce 
comparable groups (no obvious differences). [This applies to groups which 
have individuals apparently randomly assigned by some naturally occurring 
process, e.g., first person to walk in the door.] 

Nonrandom, but matched (control group selected to match treatment group): 


5 matched on pretest measures of some or all variables used later as outcome 
measures (individual level) 

6 matched on demographics: big sociological variables like age, sex, ethnicity, 
SES, (individual level) [Note: If matched on both personal characteristics and 
demographics call it the former not the latter] 

7 matched on personal characteristics, delinquency history, introversion-level, 
self-esteem, etc. other than dependent variables used later as outcome 
measures (individual level) 

8 equated groupwise; e.g., picking intact classroom of similar characteristics to 
treatment classroom, e.g., mean ages of groups are equal. 

Nonrandom, no matching (descriptive data regarding the nature of the group 
differences before treatment must be available for study with this design to be 
eligible; if initially nonequivalent groups, posttest only, with no information about 
group similarity, then study is not eligible for coding): 

9 originally random or quasi-random but with refusals, exclusions, selections, or 
other degradations after assignment and before treatment starts amounting to 
10 to 15 percent of group or more. [Note: This does not refer to attrition after 
treatment begins, only between point of assignment and onset of treatment, 
e.g. groups selected randomly from school roster but many refuse to participate 
in offered treatment. Treatment drop-out issues are coded elsewhere.] 

10 individual selection on basis of need, volunteering, convenience, or some other 
such factor 

11 convenience comparison groupwise, i.e., other available group such as a 
classroom taken w/ o matching or equating (like individual selection but done 
groupwise) 

12 other: 

13 cannot tell 
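
The thirteen [SH27] codes fall under the three headings used in the list above, and only the "nonrandom, no matching" codes carry the extra requirement that descriptive data on pre-treatment group differences be available. A minimal sketch of that grouping follows. The function names are illustrative and not part of the manual.

```python
def assignment_family(sh27: int) -> str:
    """Map an [SH27] code to the heading it appears under in the list."""
    if 1 <= sh27 <= 4:
        return "random or quasi-random"
    if 5 <= sh27 <= 8:
        return "nonrandom, matched"
    if 9 <= sh27 <= 11:
        return "nonrandom, no matching"
    if sh27 == 12:
        return "other"
    if sh27 == 13:
        return "cannot tell"
    raise ValueError(f"unknown SH27 code: {sh27}")


def requires_pretreatment_description(sh27: int) -> bool:
    """True for designs that are eligible only when descriptive data on
    group differences before treatment are available."""
    return assignment_family(sh27) == "nonrandom, no matching"
```

Such a grouping could be used, for instance, to flag records coded 9 to 11 for a second look at whether group similarity information was actually reported.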

Confidence of judgment on how subjects were assigned [SH28]: 

1 Very Low (Guess) 
2 Low (Informed Guess) 
3 Moderate (Weak Inference) 
4 High (Strong Inference) 
5 Very High (Explicitly Stated) 


Identify all the variables for which comparisons were made between the treatment 
and control group prior to application of the treatment. These are comparisons that 
would indicate how similar the treatment and control groups were on some 
variable(s) after assignment to the respective groups but before treatment was given 
to the treatment group. Divide these comparisons into two categories: 

a) statistical comparisons: variables on which the groups are compared in terms of 
statistics such as means or proportions, or for which the results of statistical 
significance testing are reported; 

b) descriptive comparisons: variables for which it is reported that there is or is not 
a difference but no statistics are provided nor any indication of the results of 
statistical significance testing. 

Number of variables statistically compared prior to intervention [SH30]: 
Number of variables descriptively compared prior to intervention [SH31]: 

General Results of Equivalence Comparisons. [SH29] Select ONE (if both, 
use statistical). 

[Note: For the ratings below, an "important" difference means a difference on most 
of the variables, or on a major variable, or large differences; major variables are 
those likely to be related to delinquency, e.g., history of delinquency or other 
antisocial behavior (chargeable offenses), delinquency risk or prediction, sex, age, 
ethnicity, SES, family circumstances, temperament.] 

Note also that this item is best answered after you make your group equivalence 
effect sizes (described below) so that you can incorporate the magnitude of the effect 
sizes into your decision about their importance. 

1 no comparisons made 

Results of statistical comparison(s): 

2 no apparent differences 

3 differences exist, but judged unimportant by coder 

4 differences exist, judged of uncertain importance by coder 

5 differences exist, and judged important by coder 

Results of descriptive comparison(s) [if no statistical comparisons made]: 

6 negligible differences, judged unimportant by coder 

7 some differences, judged of uncertain importance by coder 

8 some differences, judged important by coder 

STATISTICAL COMPARISON WORKSHEET 

For each variable identified below on which the treatment and control group were 
compared prior to treatment (other than pretests on outcome variables) OR on 
which you can tell equivalence (e.g., if matched on age, etc.) AND for which 
sufficient data exist, determine the direction of difference and, if possible, calculate 
an effect size. NOTE: you only have to make one effect size for each comparison type 
(e.g., if you have two measures of age, like average age in years and average grade, 
you need only make one group equivalence effect size). 

In the case of all male samples, there is no need to make a group equivalence effect 
size for sex, although you would use this information in judging group similarity and 
within group heterogeneity below. 

Do not include here any comparisons on pretest variables, that is, measures of an 
outcome (dependent) variable taken prior to treatment (e.g., prior number of arrests 
in six-month period when number of arrests in six months subsequent to treatment 
is used as an outcome measure). In such cases the pretreatment ES is coded later as 
pretest information, not here as group equivalence information. Prior delinquency is 
a pretest for a delinquency outcome measure if it is in the same form as the posttest 
(e.g., both court records or both self report but not one of each), measures the same 
thing, and covers the same time interval (e.g., whether arrested in six-month 
period). If the prior delinquency IS a pretest, DO NOT code it here. One rule is that 
it is a pretest if you could compare this with the posttest and get something 
meaningful. 

(a) A variable is only a pretest if it is operationalized exactly like the posttest in all 
regards except time of measurement. Note especially that for delinquency 
measures the time period covered must be identical for a pre and post measure 
to qualify; total prior arrests before treatment is not a pretest for arrests over the 
six months after treatment. 

(b) See codebook for instructions on calculating effect sizes. Be sure the sign of the 
ES is correct: positive ES favors treatment group, negative ES favors control 
group. 

(c) If there is more than one eligible variable in any of these categories, report on the 
one that has the most complete information or, in the case of prior delinquency 
history and typology, the one most relevant to overall delinquency risk. 

(d) The variables considered here are the same ones that are eligible for coding in 
the section on breakouts and should be coded there if available. 
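
The sign rule in (b) is easy to get wrong for group equivalence variables where a lower value is the favorable one, such as age or prior delinquency history. The effect size formulas themselves are in the codebook and are not reproduced here. The sketch below assumes a standardized mean difference with a pooled standard deviation, which is a common choice but an assumption in this context, and the function name and parameters are hypothetical.

```python
import math


def group_equivalence_es(mean_t: float, sd_t: float, n_t: int,
                         mean_c: float, sd_c: float, n_c: int,
                         lower_favors_treatment: bool = True) -> float:
    """Standardized mean difference signed so that positive favors treatment.

    Assumes a pooled-SD standardized mean difference. Set
    lower_favors_treatment=True for variables where a lower treatment-group
    value is favorable (e.g., younger age, less prior delinquency).
    """
    if n_t + n_c <= 2:
        raise ValueError("need more than two subjects in total")
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    if pooled_var <= 0:
        raise ValueError("pooled variance must be positive")
    raw = (mean_t - mean_c) / math.sqrt(pooled_var)
    # Flip the sign when a lower treatment mean is the favorable direction.
    return -raw if lower_favors_treatment else raw
```

With a treatment group averaging 14 years and a control group averaging 15 years (SD 2 in both, 50 subjects each), the raw difference is negative, but the coded effect size is +0.50 because the younger treatment group is favored.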

Type of Comparison [SC4] 

1 Sex 

2 Age 

3 Ethnicity 

4 Prior Delinquency History 

5 Delinquency Typology or Risk Level (e.g., type of offender, propensity to commit 
crime, etc.) 

If you have two measures of prior history (like severity and type of offense) use 
severity as prior history and type as typology if you have no other typology 
information. If you have all three either throw out type or aggregate it with severity, 
by averaging the ES values. 

Direction Favors [SC5] (Direction of the raw difference on the statistics or 
description provided): 

1 favors treatment group (Tx has fewer males, is younger, has fewer minorities, less 
delinquency history, or less delinquency risk) 

2 favors control group (see above) 

3 favors neither (exactly the same, reported as no difference, matched) 

4 ?? cannot tell 


Groups matched on this variable? [SC6] Yes or No 

___ treatment group sample size for ES calculation [SC1] 

___ control group sample size for ES calculation [SC2] 

__.__ effect size (two decimals with an algebraic sign in front: plus if 
favors treatment, minus if favors control) [SC21] 

Once you've coded the group equivalence effect sizes, return to the Header file and 
complete the group equivalence coding. 

Similarity rating [SH52]: 

Using all the available information, including method of assignment to groups 
(whether random, matched, etc.), rate the overall similarity of the treatment group 
and the comparison group, prior to treatment, on factors likely to have to do with 
delinquency and responsiveness to treatment (ignore differences on any irrelevant 
factors). 

[Note: Greatest equivalence from "clean randomization" with prior blocking on 
relevant characteristics and no subsequent degradation; least equivalence with some 
differential selection of one "type" of individual vs. another on some variable likely 
to be relevant to delinquency, e.g., police referrals for treatment compared with 
"normal" high school sample.] 
[Guidelines: The bottom 3 points are for good randomizations and matchings, e.g., 
1=clean random, 2=nice matched. The top three points are for selection with no 
matching or randomization. Within this bracket, the question is whether the 
selection bias is pertinent. Were subjects selected explicitly or implicitly on a 
variable that makes a big difference in delinquency? The middle three points are for 
sloppy matching designs, degradations, bad wait list designs, and the like. If the data 
indicate equivalence but the assignment procedure was not random, give it a 4 or 
thereabouts since not all possible variables were measured for equivalence between 
groups.] 

Very similar,                             Very different, 
equivalent                                not equivalent 

1      2      3      4      5      6      7 


Confidence rating [SH53]: 


Very Low   Low         Moderate     High         Very High 
1          2           3            4            5 
(Guess)    (Informed   (Weak        (Strong      (Explicitly 
           Guess)      Inference)   Inference)   Stated) 


NA for cannot tell 

SUBJECTS SCREEN 

CHARACTERISTICS OF SUBJECTS IN TREATMENT GROUP 

[Note: LE = law enforcement; JJ = juvenile justice] 

Note: the offense that results in the juvenile entering treatment "counts" as an 
offense for purposes of this question and the following questions about the juveniles' 
prior histories. 

Predominant level of reoffense risk of treated subjects [SH81] at onset of 
treatment (check best one): 

1 nondelinquents, normal (no evidence of LE or JJ contact or illegal behavior; no 
identified symptoms or risk factors; regular kids) 

2 nondelinquents, symptomatic (no evidence of LE or JJ contact or illegal 
behavior, but risk factors such as poverty, family problems, school behavior 
problems, Glueck scale scores, teacher referrals, etc.) 



3 predelinquents, minor police contact (no formal probation or court contact or 
minor self-reported delinquency; minor drug infractions, traffic and status 
offenses, counseled and released, etc.) 

4 delinquents (formal probation and/or court adjudication but noncustodial or 
significant self-reported delinquency, e.g., burglary, property crimes, auto 
theft; any juvenile who went to court) 

5 institutionalized, non-JJ setting (e.g., mental health in-patient; not just 
detained pending hearing) 

6 institutionalized, JJ setting (e.g., in group home, camp, reform/training school, 
etc. under court order) 

These first six constitute our risk scale; the remaining items are for mixed groups in 
which no single "type" predominates. 

7 mixed, mostly low end of range (nondelinquent & predelinquent) 

8 mixed, mostly moderate to high end of range (predelinquent & 
delinquent/ sometimes institutionalized) [Note: This is appropriate if there are 
offenses for all of the kids.] 

9 mixed, full range (nondelinquent through delinquent/ institutionalized) 

10 cannot tell 

Confidence in judgment of level of delinquency (or crime) risk [SH82]: 


Very Low   Low         Moderate     High         Very High 
1          2           3            4            5 
(Guess)    (Informed   (Weak        (Strong      (Explicitly 
           Guess)      Inference)   Inference)   Stated) 

NA for cannot tell 

Number of treated subjects w/ officially recorded priors [SH83]: 
Approximately how many of the treatment juveniles have prior offense records 
(check best one): 

1 none 

2 some (<50%) 

3 most (= or >50%) 

4 all (>95%) 

5 some, but cannot estimate proportion 

6 cannot tell 

Predominant type of prior offense reported for treatment subjects 
[SH84] (check best one): 

1 no priors 

2 mixed or undifferentiated offenses (you know there are offenses but you do not 
know what types or the percentage of subjects with each) 

3 person crimes (assault, sexual) 

4 property crimes (burglary, theft, vandalism) 

5 drug/ alcohol (possession, sale, public intoxication) 

6 status offenses (runaway, truancy, incorrigible) 

7 other specific: 

8 cannot tell 

Number of treated subjects w/ aggressive histories [ SH85] : Does the history 
of the treated juveniles include any suggestion of aggression, violence, assaultive 
behavior against persons, etc. whether officially recorded or not (check best one) : 

1 no 

2 yes, some juveniles (<50%) 

3 yes, most juveniles (= or >50%) 

4 yes, all juveniles (>95%) 

5 some, but cannot estimate proportion 

6 cannot tell 
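
The proportion categories used for this item (and for the priors and sex items, which share the same cut points) can be expressed as a small lookup. The sketch below is illustrative only: the function name is hypothetical, it assumes exact counts are available, and option 5 ("some, but cannot estimate proportion") cannot be derived from counts and remains a coder judgment.

```python
def proportion_code(n_with_trait, n_total):
    """Map a count to the form's 1-4 proportion options; 6 when unknown.

    Cut points follow the form: <50% = some, 50% or more = most, >95% = all.
    """
    if n_with_trait is None or n_total is None:
        return 6  # cannot tell
    if n_total <= 0 or not 0 <= n_with_trait <= n_total:
        raise ValueError("counts must satisfy 0 <= n_with_trait <= n_total, n_total > 0")
    share = n_with_trait / n_total
    if share == 0:
        return 1  # none
    if share > 0.95:
        return 4  # all (>95%)
    if share >= 0.50:
        return 3  # most (= or >50%)
    return 2      # some (<50%)
```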

Sex of treated subjects [ SH86] or best guess (check best one) : 

1 no males ( >95% female) 

2 some males ( <50 %) 

3 mostly males ( = or >50 %) 

4 all males (>95%) 

5 some males, but cannot estimate proportion 

6 cannot tell 

Approx. mean age of treated subjects at time of treatment [SH87] (one 
decimal; 99.9 if cannot tell) [Note: Code best information available even if must 
estimate, e.g., from grade levels] 

How reported? [SH88] How reported/determined (check one used): [Note: Listed 
in order of preference; if have choice, take higher form in list] 

1 median 

2 mean 

3 mode 

4 midpoint of range 

5 inference from school grade or other such information 

6 not applicable 
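
The "order of preference" rule for SH87/SH88 can be sketched as picking the first available statistic from the list as printed. The dictionary keys and function name below are hypothetical, and using code 6 together with 99.9 when no age information exists is an assumption about how the two fields pair up.

```python
# Preference order exactly as listed on the form (SH88 codes 1-5).
AGE_SOURCES = [
    ("median", 1),
    ("mean", 2),
    ("mode", 3),
    ("midpoint", 4),
    ("inferred", 5),
]


def code_age(reported):
    """Return (SH87 age to one decimal, SH88 source code)."""
    for key, code in AGE_SOURCES:
        value = reported.get(key)
        if value is not None:
            return round(float(value), 1), code
    return 99.9, 6  # assumption: no usable information at all
```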

Predominant ethnicity of treatment subjects: [SH89] more than 60% of 
juveniles (check best one or best guess) : 

1 Anglo 

2 Black 

3 Hispanic 

4 other minority 
5 mixed (several, but none more than 60%) 

6 mixed, but cannot estimate proportions 

7 cannot tell 


Using above information, how heterogeneous is the treatment group? 
[SH90] Overall heterogeneity rating: Based on all the evidence available, how 
diverse or heterogeneous is the treatment group with regard to delinquency history, 
demographics, personal characteristics, and conditions relevant to delinquency, 
etc.? [Note: The issue is one of within-group heterogeneity. A highly selective group 
would rate 1 or 2 and a program that takes all comers would rate a 6 or 7.] 

Very Homogeneous         1  2  3  4  5  6  7   Very Heterogeneous 
(Juveniles quite                               (Juveniles quite 
similar to each other)                         different from each other) 

cannot tell 


Confidence in homogeneity rating: [SH91] 


Very Low   Low         Moderate     High         Very High 
1          2           3            4            5 
(Guess)    (Informed   (Weak        (Strong      (Explicitly 
           Guess)      Inference)   Inference)   Stated) 


NA for cannot tell 

CONTROL SCREEN 

WHAT'S DONE TO CONTROL GROUP [SH54] 

What the control group receives (select best one): [Note: The difference between 
'receives nothing' and 'treatment as usual' hinges on whether or not the two groups 
have an institutional framework or experience in common, e.g., probation 
supervision, institutionalization, school.] 

1 receives nothing; no evidence of any treatment or attention; may still be in 
school or on probation etc., but that is incidental to the treatment strategy or 
client population as defined 

2 wait list, delayed treatment control, etc.; contact limited to application, 
screening, pretest, posttest, etc. 

3 minimal contact; instructions, intake interview, etc.; but not wait listed 

4 parole - treatment as usual 

5 school - treatment as usual (if treatment delivered in a school setting) 

6 probation - treatment as usual (if treatment delivered in a juvenile justice 
setting) 

7 institutionalization - treatment as usual 

8 other - treatment as usual 

9 attention placebo, e.g., control receives discussion, attention, or dilute version 
of treatment 

10 treatment element placebo; control receives target treatment except for defined 
element presumed to be the crucial ingredient 

11 alternate treatment; control is not really a "control," but another treatment 
(other than "usual" treatment) being compared with the focal treatment [Such 
comparisons are not eligible for coding unless the alternate treatment is 
designed as a contrast to a focal treatment, e.g., a very dilute dose or a "straw 
man" not expected to perform well.] 

12 cannot tell 

Overall confidence of judgment on what control group receives: [SH55] 


Very Low   Low         Moderate     High         Very High 
1          2           3            4            5 
(Guess)    (Informed   (Weak        (Strong      (Explicitly 
           Guess)      Inference)   Inference)   Stated) 


NA for cannot tell 

Text box for notes about control group 

Describe the character of the control group briefly with particular attention to any 
experiences they have in common with the treatment group (e.g., "also on 
probation") and what part of their experience is distinctly different from that of the 
treatment group (e.g., "in regular institution rather than cottages and doesn't 
participate in the guided group program") . 


TREATMENT SCREEN 

WHAT'S DONE TO TREATMENT GROUP 


Source of clients for treatment [SH56] (check best one): [Note: The issue here 
is who took the initiative in identifying or choosing subjects for the treatment, e.g., 
were they identified by teachers or by researchers using the teachers' records?] 

1 sought treatment voluntarily ("self-referral," "walk-in") 

2 referred/identified by parents, friends 

3 referred/identified by non-CJ community agency (schools, teachers, mental 
health, etc.) 

4 referred/identified by CJ agency, but "voluntary" (e.g., via police, probation, 
court, etc.) 

5 referred/identified by CJ agency, but participation mandated (e.g., by court, 
terms of probation, institution). [Assume it is mandatory if it is a CJ agency 
unless there is specific information that it is voluntary. Don't override a specific 
statement that it's voluntary even if you presume there is some coercion.] 

6 referred/identified by multiple sources, none predominates 

7 solicited or arranged by researcher 

8 other 

9 cannot tell 

Type of treatment: Link to Service Codes Screen 

Overall confidence in judgment about type of treatment: [SH59] 


Very Low   Low         Moderate     High         Very High 
1          2           3            4            5 
(Guess)    (Informed   (Weak        (Strong      (Explicitly 
           Guess)      Inference)   Inference)   Stated) 

NA for cannot tell 

Who administers treatment [SH61] (check best one): 

1 criminal justice or juvenile justice personnel (e.g., police, probation officer, 
judge, etc.) 

2 school personnel (e.g., teachers, principals) 

3 mental health personnel (public agency) 

4 mental health personnel (private agency, counselors, etc.) 

5 non-mental-health professionals, counselors, consultants, etc., e.g., vocational 
counselors 

6 laypersons, e.g., volunteers, college students, ex-delinquents 

7 researcher/research team 

8 other: 

10 mixed, multiple personnel (contact with more than one treatment delivery 
person & none is clearly focal). Do not use this option when different subjects 
are seeing different types of personnel. In those cases, select a focal personnel 
type. 

11 cannot tell 

Format of treatment sessions [SH62] (check best one; if mixed, check 
predominant category): 

(Note: The primary emphasis of this question is on who was present with the 
juvenile during treatment; emphasis on number of providers present is secondary.) 

1 juvenile alone (self-administered treatment, e.g., bibliotherapy) [This refers to 
a treatment in which nobody else is present. If it is restitution performed in a 
group it does not belong here but if a juvenile is sent out to do something (like 
get a job) it goes here.] 

2 juvenile and provider, one on one 

3 juvenile group, one or more providers 

4 juvenile with family/parents, one or more providers 

5 parents only, juvenile not present 

6 teachers, probation officers etc. only; juvenile not present 

7 mixed; no single format predominates 

8 other: 

9 cannot tell 

Nature of treatment site: [SH63] site on which treatment generally delivered 
(check best one in each set): [Note: Customary treatment location irrespective of 
who administers treatment. If restitution is the treatment, the site will be mixed, 
none predominates.] 

1 Public facility (i.e., owned and operated by city, county, state, federal 
government body), JUSTICE-ORIENTED, e.g., probation dept., police station, 
reform school 

2 Public facility (i.e., owned and operated by city, county, state, federal 
government body), NOT JUSTICE-ORIENTED, e.g., school, dept. of mental 
health 

3 Private facility, e.g., YMCA, private counseling agency, university (even if state 
university) 

4 mixed, none predominates 

5 other: 

6 cannot tell 

Custodial/residential facility? [SH64] e.g., camp, reformatory, psychiatric 
hospital, halfway house, foster home, etc. 

1 yes 

2 no 

3 mixed, neither predominates 

4 cannot tell 
Formal setting? [SH65] (e.g., office, classroom, institution, laboratory, etc.) 

1 yes 

2 no, informal, e.g., outdoors, streets, juvenile's home, etc. 

3 mixed, neither predominates 

4 other: 

5 cannot tell 

SERVICE CODES SCREEN 
Treatment description [SH100txt] 

Relationship of Juveniles in Treatment to the Juvenile Justice System 
[SH100] 

The purpose of this item is to capture the status of the juvenile at the time treatment 
was actually received. Juvenile justice supervision means that they are officially 
supervised while on probation, in a residential/custodial facility, or on 
parole/aftercare and can be sanctioned by the JJ authorities if they fail to comply 
with the terms of that supervision. A juvenile is not under the authority of the JJ 
system if they are not being monitored on an on-going basis by JJ authorities. 
Non-JJ supervision can include juveniles that were routed to services via the JJ system 
(diversion), but are participating in the services without official JJ supervision. 

Yes, juveniles under JJ supervision (under the authority of the JJ system) when 
they received the treatment 

On probation (under probation supervision but not in custodial institution nor 
aftercare/parole after a term in a custodial institution). 

1 on probation, in community (or no indication that not). Describe: 

2 on probation but in a residential or partially residential setting, e.g., day 
treatment, probation camp. Describe: 

In a juvenile justice custodial institution, e.g., training/reform school, borstal, 
detention center, juvenile correctional institution. 

3 "regular" juvenile correctional institution (or no indication that not). Describe: 

4 alternative or special form of custodial institution, e.g., cottage format, 
psychiatric correctional ward. Describe: 

On JJ supervised parole or aftercare after a term in a custodial institution (after 
incarceration). 

5 nonresidential JJ parole or post-custodial aftercare. Describe: 

6 partial residential JJ parole or post-custodial aftercare, e.g., day treatment 
program. Describe: 

7 fully residential JJ parole or post-custodial aftercare, e.g., group home, halfway 
house. Describe: 

Any other form of JJ supervision or under JJ authority but cannot tell which of 
above is applicable. 

8 other JJ supervision. Describe: 


No, juveniles not under JJ supervision when treatment received 
(through some route such as diversion by law enforcement or juvenile 
justice personnel, and are not under JJ supervision while in 
treatment.) 

Note: If juveniles initially involved with police or juvenile justice system but then 
diverted away from official JJ processing and released or sent to a community 
program, note this in the write-in space for description for the option to which it 
applies. Such a situation may involve the threat of JJ processing if treatment is not 
completed but the juvenile will not actually be under JJ supervision at the time of 
treatment following the diversion. 

9 in the community with no apparent constraints or residential program 
arrangement. Describe: 

10 in a non-JJ partially residential setting, e.g., non-JJ day treatment program, 
alternative school. Describe: 

11 in a non-JJ fully residential setting, e.g., group home, foster care. Describe: 

12 other non-JJ situation. Describe: 

All other or cannot tell which of the above apply. 

13 Cannot tell. Describe: 

Treatment Components 

Identify all the treatment components, elements, activities, experiences, etc. 
reported as part of the intervention. Note that to qualify, a component should be 
something the treatment group receives that the control group does not receive. Use 
the following rating scale for each reported component. At least one component 
must be rated for every intervention but as many components can be rated as 
needed to describe every distinct element reported. 

Some items are listed multiple times and are indicated with a similar superscript. 
Although an item may be listed under several categories, it should only be rated one 
time for each intervention. Items that are in bold type are considered "brand name" 
interventions. These should only be chosen if mentioned specifically by name within 
the study report(s). If the treatment description sounds like it has all or most of the 
components of a particular "brand name" intervention, but it is not specifically 
called by that name, place it in the "similar to" category. 


It is important to assign a code to all treatment components mentioned for each 
intervention using the numerical scheme below. Initially you should assume that 
each such component will receive a rating of "1," as if "1" were a checkmark to check 
off every item present. However, if there is any indication in the study report(s) that 
one or more components are of lesser scope or importance than others, then those 
secondary items should be coded "2." A component might be identified as secondary 
in this sense because: 

a) it is clearly a subcomponent of something else (e.g., role-playing as one of several 
parts of an attitude change session) or there is a broad program type to be coded 
"1" (e.g., interpersonal skills building) and the component is only one aspect of 
that (e.g., anger management exercises); 

b) it is provided to only a subset of juveniles or only occasionally in contrast to 
other components provided to all juveniles or on all occasions (e.g., a service that 
some juveniles are referred to only if they need it while others are provided to 
all); 

c) some other distinction is made that shows that the component is not of equal 
importance, stature, or scope as others that are coded "1." 

If there is no basis for distinguishing any components as having less importance, 
scope, stature, etc. than any other, code all as "1." If you have some reason to doubt 
that all the components are at the same level, but a clear determination cannot be 
made about which should be coded "1" and which "2," then code all the uncertain 
components as a "9." 

1. a treatment component with no indication that it is a subcomponent, of less scope, 
provided to fewer juveniles, etc. than any other component 

2. a treatment component that is a subcomponent, of less scope, provided to fewer 
juveniles, etc. than some other component 

9. one of a set of components that may be at different levels ("1" vs "2" above) but it 
is uncertain which is which (i.e., cannot clearly and comfortably determine if a 
component is a "1" or "2") 
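
The rating rules above (every component rated 1, 2, or 9; at least one component per intervention; each item rated only once even if listed under several categories) lend themselves to a simple consistency check at data entry. The sketch below is a hypothetical helper, not part of the coding software described in this manual; component codes are treated as plain strings such as "tc59".

```python
VALID_RATINGS = {1, 2, 9}


def check_component_ratings(ratings):
    """Return a list of problems found in (component_code, rating) pairs."""
    problems = []
    if not ratings:
        problems.append("at least one component must be rated")
    seen = set()
    for code, rating in ratings:
        if code in seen:
            problems.append(f"{code} rated more than once")
        seen.add(code)
        if rating not in VALID_RATINGS:
            problems.append(f"{code} has invalid rating {rating!r}")
    return problems
```

An empty list means the entry satisfies these three rules; it does not check whether the choice between "1" and "2" is substantively right, which remains a coder judgment.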

JJ or CJ-type Treatment Elements 

[tc1] probation, regular (compared to no probation supervision) 
[tc3] parole/aftercare, regular (compared to no parole/aftercare supervision) 
[tc5] institutionalization, regular (jail, detention center, prison, etc. compared to no institutionalization) 
[tc7] early release from institution, probation, or parole (shortened sentence) 
[tc8] furloughs from custody (e.g., family visits, field trips without JJ staff members) 
[tc123] work release program (e.g., work in the community while still incarcerated) 
[tc124] work program (work in the institution while still incarcerated) 
[tc9] intensive supervision or monitoring, reduced caseload, smaller units, more frequent drug screens 
[tc10] community monitoring (e.g., sex offender registry, electronic bracelet) 
[tc11] drug court (e.g., more lenient sentencing to substance abuse treatment in closed facility) 
[tc12] prison visit, not overnight (e.g., scared straight, etc.) 
[tc13] short term "shock" incarceration (juvenile stays overnight at least 1 night) 
[tc14] deterrence threat (e.g., straight talk with police officers, "lecture and release") 
[tc137] Teen Court, type of alt. sentencing & peer review/sentencing format 
[tc15] military style "boot camp" (relatively short term) 
[tc16] restitution, fines or payment/service to victim or victim's family 
[tc17] restitution, community service (e.g., landscaping, hospital, nursing homes, etc.) 
[tc138] restitution, contact with victim (e.g., apology letters, apology in person) 
[tc18] diversion specifically stated as a descriptor of the program 
[tc2] alternative to probation (would be on probation but something else instead) 
[tc4] alternative to institutionalization (would be institutionalized but something else instead) 
[tc6] alternative to parole/aftercare (would be on parole/aftercare but something else instead) 
[tc122] receives treatment/service program instead of JJ supervision 
[tc125] receives probation instead of greater supervision, e.g., institutionalization 
[tc136] receives informal probation instead of greater supervision, e.g., regular probation, institutionalization 
[tc19] other 

Residential Components 

[tc20] psychiatric facility 
[tc21] teaching family home 
[tc21s] similar to teaching family home 
[tc139] emergency shelter/shelter house 
[tc22] group home; foster parents 
[tc23] wilderness camp, short term - two weeks or less in camp (e.g., Outward Bound) 
[tc118] wilderness camp, not short term - more than two weeks 
[tc15] boot camp 
[tc25] other camp 
[tc26] residential drug treatment 
[tc27] boarding school/residential training school (cottage model, small scale/disaggregated) 
[tc28] guided group interaction, in a residential setting (e.g., offenders determine rules & punishment) 
[tc28s] similar to guided group interaction 
[tc111] positive peer culture, in a residential setting (e.g., members are responsible for themselves as well as others and serve as catalysts for helping others and advancing the group) 
[tc111s] similar to positive peer culture 
[tc29] therapeutic community 
[tc29s] similar to therapeutic community 
[tc30] milieu therapy 
[tc30s] similar to milieu therapy 
[tc31] other 

Educational Components 

[tc135] school-based: program provided in regular school setting 
[tc32] special classes or educational field trips 
[tc33] continuation/additional school (not employment related) 
[tc34] tutoring, or current level of education (not employment related); by whom? 
[tc35] remedial education (not employment related) 
[tc120] alternative school, as alternative for regular (e.g., public) school 
[tc160] educational testing 
[tc140] assigning homework 
[tc141] teaching juveniles study techniques 
[tc142] academic monitoring (e.g., monitoring homework, academic performance, attendance, etc.) 
[tc161] computer classes (academic, separate from vocational) 
[tc36] other 

Counseling Components 

[tc37] individual counseling, therapy, psychotherapy, guidance; by whom? 
[tc38] group counseling, therapy, psychotherapy; by whom? 
[tc127] group counseling, led by a facilitator but not necessarily "talk therapy" (e.g., facilitated discussions) 
[tc112] guided group interaction (nonresidential) 
[tc112s] similar to guided group interaction (nonresidential) 
[tc113] positive peer culture (nonresidential) 
[tc113s] similar to positive peer culture (nonresidential) 
[tc114] multi-systemic therapy 
[tc114s] similar to multi-systemic therapy 
[tc143] client-centered therapy 
[tc40] family counseling, family systems, functional family therapy, etc. (w/ whole family or juv and parent) 
[tc144] multi-family groups (e.g., "family group" participates in counseling as a whole along with other families) 
[tc41] parent counseling without juvenile, individual 
[tc42] parent counseling without juvenile, parent groups 
[tc43] alcohol counseling (see also Drug and Alcohol Components) 
[tc44] casework: support/services provided by caseworker (not case manager) interceding with others, helping juvenile, etc. ("all-purpose") 
[tc145] in-home counseling, counseling takes place in the home of the juvenile or family 
[tc45] mediation (counselor mediates/arbitrates between parties in conflict or victim and offender) 
[tc46] recreational therapy (see also Recreational Components) 
[tc47] reality therapy 
[tc146] sex offender counseling 
[tc48] crisis counseling, response (e.g., come out to house to intervene) 
[tc119] non-specific counseling (not otherwise identified) 
[tc49] other 

Recreational Components 

[tc46] recreational therapy 
[tc121] recreation (non-specific) 
[tc51] fitness programs (e.g., weights, sports - not for competition, increased exercise) 
[tc52] sports, athletics, or athletic events 
[tc53] parties, games, recreational outings, field trips (other than educational) 
[tc147] adventure-based activities, ropes course, canoeing, etc. 
[tc54] arts & crafts, drama, music, dance activities, games, etc. (groups and individually) 
[tc55] other 
Interpersonal/Personal Skill Components 

[tc56] interpersonal skills building (e.g., communication skills, role playing, assertion training) 
[tc57] resisting group pressure, responding to persuasion 
[tc58] peer/group interaction (meetings, discussions, activities) 
[tc59] mentor provided for juvenile (peer, volunteer, layperson, "big brother") 
[tc60] juvenile served as mentor as part of tx 
[tc61] moral education, training; religious or spiritual program 
[tc62] interpersonal problem solving, conflict resolution, decision making 
[tc148] personal/self development training (e.g., self esteem building, focusing on indiv. strengths, self-awareness, leadership, goal setting, etc.) 
[tc63] anger management (other than cognitive behavioral); stress management (see also cog anger mgmt) 
[tc64] other 
Cognitive Skills / Cog Restructuring Components 

[tc115] cognitive/behavioral intervention (overall focus on altering irrational thinking and behavior) 
[tc115s] similar to cognitive/behavioral intervention 
[tc65] cognitive restructuring (monitoring automatic thoughts, correcting distortions/thinking errors, etc.) 
[tc66] cognitive anger management (hassle logs, identify triggers, use self-statements and anger reducers, etc.) 
[tc67] moral reasoning; empathy & victim impact (moral dilemmas; perspective taking; empathy for victim) 
[tc68] attitude change, accepting authority & rules, new attitude towards law, court, police, peers, etc. 
[tc69] relapse prevention plan; interventions for lapses; high-risk situation planning 
[tc70] other, describe 

Behavioral Components 

[tc71] behavioral contracting, contingency management; behavior modification (e.g., rewards; shaping of specific behaviors; reinforcement for desired behaviors) 
[tc72] behavior modification (e.g., rewards, shaping, reinforcement of behaviors, etc.) 
[tc73] punishment, discipline (e.g., segregation, privileges taken away, denial of family visits) 
[tc74] token economy - tokens earned, redeemable for privileges, goods, etc. 
[tc75] learning by modeling 
[tc76] desensitization, exposure + response prevention, flooding 
[tc77] relaxation training (e.g., deep breathing, counting backward, imaging of peaceful scenes) 
[tc78] meditation (mindfulness therapy, living in the moment, yoga, transcendental meditation) 
[tc149] role playing (non-specific or a general activity, not a technique used with another component) 
[tc79] anger reducing techniques (e.g., push-ups, time-outs, walking around) (see also cognitive anger mgmt) 
[tc80] other 
Employment Components 

[tc81] remedial education, employment related; any functional education (literacy, GED, arithmetic) 
[tc82] tutoring (one on one), teaching machine, help to achieve academic success (employment related) 
[tc116] continuing education (employment related) such as special or advanced classes 
[tc83] employment; supervised group work program 
[tc128] employment; job placement for individual juveniles 
[tc84] career counseling (career exploration, job readiness, job searching skills, interview skills) 
[tc85] job training - learning new job content, trade, specific skills (e.g., welding, construction, computer) 
[tc150] vocational field trip (separate from educational or recreational field trip) 
[tc151] non-paid work service (e.g., community service not in conjunction with restitution, etc.) 
[tc162] computer classes (vocational, separate from academic) 
[tc86] other 

Life Skills/Needs Components 

[tc87] personal management (attendance, housing issues, time/money-management skills) 
[tc88] managing daily life problems (problem solving, social/moral reasoning, balancing responsibilities) 
[tc89] challenge programs, short term (e.g., survival training, Outward Bound) 
[tc90] parenting/family skills for parent of target juvenile (parent effectiveness training alone or with juvenile) 
[tc152] provides necessities (e.g., clothes, transportation, food, etc.) 
[tc91] health-related prevention (pregnancy, STD) 
[tc153] health education (e.g., personal hygiene, nutrition, etc.) 
[tc154] legal education (juveniles learn about the judicial system and judicial processes) 
[tc92] other 
System-Oriented Components 

[tc93] advocacy on behalf of youth (must be clearly identified as all or part of the treatment program) 
[tc94] consultation, assistance to schools/agencies responsible for juveniles' welfare 
[tc95] special training for service providers (school staff, counselors, probation officers) 
[tc96] facilitative assistance for service providers, other than training (group discussions, information sharing) 
[tc97] parents of juvenile offender receive skill building intervention other than parenting skills (w/o juvenile) 
[tc155] regular contact with parents (parental involvement) 
[tc98] outreach workers, streetworkers (service personnel working with gangs, schools, etc.) 
[tc99] other 

Drug and Alcohol Components 

[tc100] drug, alcohol education 
[tc43] drug, alcohol counseling/therapy (AA or NA) 
[tc156] drug testing (conducted either on a regular or random basis) 
[tc102] other (see also Behavioral Components) 

Pharmacological, Medical, Biological Components 

[tc103] psychiatric intervention (e.g., access to psychiatrist for evaluations & prescriptions) 
[tc157] medical/emergency service 
[tc104] change in behaviors, diet, medication, sleep, etc., describe: 
[tc105] physical examination and necessary treatment (medicine) 
[tc106] other 
Multimodal Components

[tc107] service brokerage: evaluation/assessment of service need, referral to treatment; provided by an agency

[tc158] psychological assessment (separate from assessment for service brokerage)

[tc159] individualized treatment plans provided for juveniles

[tc108] multimodal service: program tailored to juveniles receiving multiple tx components

[tc109] case management (case manager identifies needs, oversees services by multiple agencies, etc.)

[tc110] other
All Other

[tc117 & tc129-tc134] any other treatment component, element, technique,
etc. identified in study report(s) and not coded above. Describe with at least
moderate detail if possible:


IMPLEMENTATION SCREEN 

TREATMENT IMPLEMENTATION/ STRENGTH/ INTEGRITY 

[Note: For this item and the next three use "facts" if available, otherwise "format".
Make an informed guess about the amount and frequency of contact whenever
possible. Even if the guess is inaccurate, it will help give an order of magnitude
estimate for the analyses. Assume that a counseling session and a school period are
probably each an hour long.]

Approximate duration of treatment in WEEKS [SH68] from first treatment
event to last treatment event. Include treatment received by treatment subjects up to
the time of posttest measurement. Divide days by 7 and round; multiply months by
4.3 and round. Code 999 if cannot tell. Estimate for this item if necessary and if you
can come up with a reasonable order of magnitude number. If no other information
is provided in the study, you can assume that probation lasts 6 months and crisis
counseling lasts 2 weeks.
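
The conversion rules for [SH68] can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the function name `duration_weeks`, its keyword arguments, and the round-half-up behaviour are hypothetical choices and are not part of the coding manual, which says only to "round".

```python
import math

CANNOT_TELL = 999        # missing-data code given in the manual for [SH68]
WEEKS_PER_MONTH = 4.3    # multiplier given in the manual


def round_half_up(x):
    """Round to the nearest whole number, halves upward (an assumption)."""
    return math.floor(x + 0.5)


def duration_weeks(days=None, months=None, weeks=None):
    """Return treatment duration in whole weeks, or 999 if it cannot be told."""
    if weeks is not None:
        return round_half_up(weeks)
    if days is not None:
        return round_half_up(days / 7)
    if months is not None:
        return round_half_up(months * WEEKS_PER_MONTH)
    return CANNOT_TELL


print(duration_weeks(days=51))    # 51 / 7 is about 7.29, coded as 7
print(duration_weeks(months=6))   # 6 * 4.3 = 25.8, coded as 26 (the probation default)
print(duration_weeks())           # no information, coded as 999
```

The fallback defaults in the manual (probation 6 months, crisis counseling 2 weeks) would simply be passed in as `months=6` or `weeks=2` when a report gives no other information.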

Determined by [SH69] (select one): 

1 facts (data about how long clients in treatment, e.g., average client attended 7.3 
weeks) 

2 format (standard package or plan without information on actual participation,
e.g., a ten-week program)

3 other estimate (e.g., coder's best guess) 

Frequency of treatment event/contact [SH70] (check best one) [Note: This
refers only to the element of treatment that is different from what the control group
receives. Estimate for this item if necessary and if you can come up with a
reasonable order of magnitude number.]

1 continuous (e.g., milieu therapy, residential program, pharmaceutical therapy,
parent effectiveness training)

2 daily contact (not 24 hours of contact per day but some treatment during each
day, perhaps excluding weekends)

3 2-4 times a week 

4 1-2 times a week 

5 less than weekly 

6 cannot tell 
Determined by [SH71] (select one): (for continuous treatments assume format
unless you have specific information about discrepancies from the prescribed format)

1 facts (data)

2 format (standard package/plan) [code continuous treatments here]

3 other estimate (e.g., coder's best guess) 

Approximate mean HOURS of contact per WEEK [SH72] (888 if
institutional): actual contact time between juvenile and provider or treatment
activity per week per juvenile if reported or calculable (Round to one decimal place.
Code 888 for institutional, residential, or around the clock program; code 999 if not
available) [Note: Estimate for this item if necessary and if you can come up with a
reasonable order of magnitude number.]

Determined by [SH73] (select one):

1 facts (data)

2 format (standard package/plan) [code continuous treatments here]

3 other estimate (e.g., coder's best guess) 

Approximate mean HOURS of TOTAL contact [SH74] over full duration of tx:
contact between juvenile and provider or treatment activity over full duration of
treatment per juvenile if reported or calculable (Round to whole number. Code 8888
for institutional, residential, or around the clock program; code 9999 if not
available) [Note: Estimate for this item if necessary and if you can come up with a
reasonable order of magnitude number. No decimals here, whole numbers only.]

Determined by [SH75] (select one):

1 facts (data)

2 format (standard package/plan)

3 other estimate (e.g., coder's best guess) 

Overall confidence in estimates of treatment contact: [SH76]

Very Low    Low          Moderate     High         Very High
1           2            3            4            5
(Guess)     (Informed    (Weak        (Strong      (Explicitly
            Guess)       Inference)   Inference)   Stated)

NA for cannot tell


Evidence of uncontrolled variation in implementation? [SH77] 

Based on evidence or author acknowledgment, was there any uncontrolled variation
or degradation in implementation or delivery of treatment, e.g., high dropouts,
erratic attendance, treatment not delivered as intended, wide differences between
settings or individual providers, etc. (check best one): [Note: This question has to do
with variation in treatment delivery, not research contact. E.g., there is no "dropout"
if all juveniles complete treatment even if some fail to complete the outcome
measures; degradation does not mean attrition per se. Implementation and delivery
of treatment to the treatment group partly overlaps the research methodology
attrition issue but also includes other aspects involving the treatment itself. Assume
that there is no problem if one is not specified and the format seems reasonably
structured.]

1 yes (describe: ) 

2 possible (describe: ) 

3 no, apparently implemented as intended 

4 cannot tell 

Taking all evidence into consideration, rate the intensity of the treatment along the 
two dimensions below: 


Rate amount of meaningful contact [SH78] between subject and treatment
(frequency, duration). Amount of meaningful contact between juvenile and
treatment (frequency, duration): [Note: Use the number of hours of contact to
determine whether the treatment falls into the bottom, middle, or high end of the
scale and then adjust the rating according to the meaningfulness of the contact. Try
to reflect any slippage between format of treatment and actual amount of contact.
Fifteen hours of basketball would rate lower than fifteen hours of counseling because
there is less contact with the change agent. A total institution experienced for a long
time would rate a "7"; a two-week wilderness program or a 10-week, once-a-week
crisis intervention program would rate about a "4"; high slippage and low
participation would yield a rating of "1" or "2". A 2 hour per day program would be
about a 6, which would be moved down if there is lots of slack time. Fifteen minutes
per week would be about a 1; an hour per week or less would be a 2 or 3.]


Trivial 1 2 3 4 5 6 7 Substantial

cannot tell


Rate intensity of typical tx event [SH79] (involving, emotional, etc.) 

Intensity of typical treatment event; how involving, emotional, memorable, etc. per
contact irrespective of amount of contact: [Note: Intensity relates to the likelihood
that this treatment will cause a psychological change or emotional reaction in the
juvenile whether therapeutic or not. Scared Straight or a wilderness program would
rate a "6" or "7", standard counseling would rate somewhere between "3" and "5",
and a boys' club after-school basketball program or informal probation would rate
somewhere between "1" and "3".]


Weak 1 2 3 4 5 6 7 Strong 

cannot tell 
Overall confidence in treatment ratings: [SH80]

Very Low    Low          Moderate     High         Very High
1           2            3            4            5
(Guess)     (Informed    (Weak        (Strong      (Explicitly
            Guess)       Inference)   Inference)   Stated)

NA for cannot tell

Dependent Variable Coding Sheet (DV)

For the aggregate experimental comparison coded on this sheet, identify the
dependent (outcome) variables on which treatment vs. control group comparison
could be made (whether actually made or not), distinguishing delinquency vs.
nondelinquency measures. If it is hard to decide whether a measure reflects
delinquency or not, err on the side of calling it a nondelinquency measure so that the
delinquency measures used in the analyses will be fairly unambiguous. Each
dependent variable represents a contrast between two groups, often reported as a
test of significance.

Exclude variables that reflect only the degree of implementation of the intervention.
Exclude variables that do not apply to the entire aggregate comparison, e.g.,
measures that subdivide categories of another measure such as single vs. multiple
offenses only for those that recidivate. Also exclude variables that do not represent
the status (behavior, attitudes, etc.) of the juveniles in the treatment and control
groups but rather the status of others, e.g., teachers, parents, juveniles outside the
experiment. Note that it is okay for teachers, parents, etc. to be the primary
treatment recipients (e.g., parent effectiveness training) but dependent variables are
nonetheless only coded for the subsequent status of the juveniles involved (e.g.,
children of those parents). Note also that it is okay for a dependent variable to
represent the observations, opinions, etc. of someone other than the juvenile so long
as it is something about the juvenile on which they are reporting (e.g., parent
opinion about whether the juvenile has improved).

If the same variable is used repeatedly for follow-up, etc., count it only once.
Otherwise, list every dependent variable that can be identified as having been used
in the study irrespective of how much information is available on it. Write in a brief
label for each below:

DELINQUENT BEHAVIOR OUTCOME MEASURES (LIST ALL)

[Definition: Delinquency outcome measures are those that index the degree of
criminal or delinquent behavior (constituting at least one chargeable offense). Direct
reports of criminal/delinquent behavior are always included here whether self-
report from the delinquent or records from police, probation, courts, etc. Also
included here are other reports of delinquent behavior such as some school or
teacher reports, e.g., having to do with disciplinary actions related to chargeable
offenses. The key factors in the delinquency vs. nondelinquency decision are (a) the
measure has to do with behavior; non-behavioral constructs, e.g., attitudes,
personality trait measures, etc., should be classified as nondelinquency; (b) the
activity involved is officially defined delinquency, or related, or else is antisocial
behavior in the sense of causing clear harm to persons, property, or self.]

Verbal tags: 

On Codesheet DM, code each of the above variables for which some treatment group
vs. control group comparison can be made, even if only a statement of
nonsignificance, no difference, or direction of effects. Code only those DVs for which
there is a statement of the direction of the effect even if that statement is that there
was no significant difference. Place a checkmark on the list above beside each
variable selected for coding. [Note: There will be four types of dependent measures:
those that were measured but not mentioned (lost), those that were mentioned with
no statement of results, those that were mentioned with a statement of significance
or direction, and those that provide enough information to calculate an effect size.
All but the first category should be listed here; all in the third and fourth categories
should be coded.]

For status offenses (those that are only offenses because the perpetrators are minors,
e.g., runaway, truancy, curfew, incorrigible) it is a delinquent behavior if it is
presented as an offense in a law enforcement framework (e.g., police or court
records), but is a non-delinquent behavior if it is presented in a non-law
enforcement framework (e.g., school records). Fighting or other clearly antisocial
behaviors that are chargeable offenses (extorting money, beating up fellow students, etc.)
are delinquent regardless of the framework in which they are presented. Indicate the
appropriate numbers below:

|__|__| Number of delinquency variables selected for coding

|__|__| Number of delinquency variables omitted

[Note: These two values should sum to the total number of variables on the above
list. Do not skip this; it is important.]
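
A data-entry check for this tally could look like the following sketch. The function name `check_dv_tally` and the example verbal tags are hypothetical and serve only to illustrate the rule that the selected and omitted counts must add up to the number of listed variables.

```python
def check_dv_tally(listed, selected, omitted):
    """Return True when selected + omitted equals the number of listed variables."""
    return selected + omitted == len(listed)


# Hypothetical verbal tags for one study's delinquency outcome list.
dv_list = ["arrests", "self-reported offenses", "court petitions"]

print(check_dv_tally(dv_list, selected=2, omitted=1))  # True: 2 + 1 matches 3 listed
print(check_dv_tally(dv_list, selected=2, omitted=0))  # False: one variable unaccounted for
```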

NONDELINQUENCY OUTCOME MEASURES (LIST ALL):

[Definition: Nondelinquency outcome measures are all those that remain after any
delinquency outcome measures are coded on the "delinquency behavior outcome
measures codesheet" according to the definitions on that codesheet.]

Verbal tags: 

On Codesheet NM, code each of the above variables for which some treatment group
vs. control group comparison can be made, even if only a statement of
nonsignificance, no difference, or direction of effects. Code only measures
representing the behavior, attitudes, perceptions, etc. of juveniles, not measures of
the behavior, etc. of others, e.g., teachers, parents, etc. even if they are the recipients
of the treatment. Place a checkmark on the list above beside each variable selected
for coding. Indicate the appropriate numbers below:

|__|__| Number of nondelinquency variables selected for coding

|__|__| Number of nondelinquency variables omitted

[Note: These two values should sum to the total number of variables on the above 
list.] 

Delinquency Variables 

Code a separate screen for each delinquency outcome measure for which the
aggregate treatment and control groups can be compared on the first wave of post-
treatment outcome. (Subsequent waves and breakouts for this aggregate comparison
are coded on separate attachments to be appended to this sheet.) Delinquency
outcome measures are those that index the degree of criminal or delinquent
behavior. Direct reports of criminal/delinquent behavior are always included here
whether self-report from the delinquent or records from police, probation, courts,
etc. Also included here are other reports of delinquent behavior such as some school
or teacher reports, e.g., having to do with disciplinary actions related to delinquent
behavior. The key factors in the delinquency vs. nondelinquency decision are 1) the
measure has to do with behavior; non-behavioral constructs, e.g., attitudes,
personality trait measures, etc., should be classified as nondelinquency; 2) the
activity involved is officially defined delinquency, or related, or else is antisocial
behavior in the sense of causing clear harm to persons, property, or self.

Type of delinquency/recidivism represented [D1] by this measure (what's
counted, irrespective of source of information and authors' label or description of
the measure) (check best one):

1 antisocial behavior, not specifically restricted to criminally delinquent acts

2 unofficial delinquent behavior, e.g., from self or observer's report

3 school disciplinary actions relating to delinquent/antisocial behavior

4 arrests or police contacts

5 probation contact, violations, actions, etc.

6 court contact, actions, petitions, convictions, appearances, etc., excluding
institutionalization

7 parole contact, violations, action, etc., excluding reinstitutionalization

8 institutional disciplinary actions (relating to delinquent/antisocial activity)

9 institutionalization or reinstitutionalization

10 catchment area crime/arrest rates (treatment for entire area)

11 catchment area JJ indicators, e.g., probation, court, parole events

12 other: 

13 cannot tell 
Definitional boundaries for measure [D2] (check best one):

01 all "offenses" included (except, perhaps, traffic offenses)

Restricted by type

02 substance abuse only

03 property crime only

04 person crimes only

05 status offenses only

06 criminal offenses only, i.e., all but status offenses

07 other

Restricted by severity

08 only major/felony

09 only minor/misdemeanor

10 other severity restriction

11 other type of restriction:

12 cannot tell
Elements reported in measure: [D3] Elements reported in this delinquency
measure irrespective of type of incident and reporting source (check best one):

1 global dichotomy or polychotomy (e.g., offended or recidivated, yes/no)

2 summed dichotomous (e.g., sum of yes/no on list of specific offenses)

3 frequency or rate (count of incidents; incidents per 1000 persons)

4 severity (seriousness rating or index)

5 event timing (e.g., days without recidivism; time to first offense)

6 proportion or amount of time in custody, under supervision, etc.

7 rating of amount of delinquency, severity, change, etc. (e.g., therapist rating of
extent delinquent behavior improved)

8 more than one of above elements combined in composite measure 

9 other: 

10 cannot tell 

Source of delinquency data [D4] (check best one) : 

Self report 

1 paper & pencil 

2 personal interview 

3 telephone interview 

4 other: 

5 cannot tell 

Other reports 

06 family 

07 peers 

08 teacher(s)

09 therapist/service provider

10 other: 

11 cannot tell 

Records 

12 school 

13 police 

14 probation 

15 court 

16 custodial institution 

17 regional crime statistics 

18 other: 

19 cannot tell

20 any other:

21 cannot tell which of above categories
Properties of this measure demonstrated, reported, or cited (check all that apply):

Properties demonstrated, validity; [DN1]

Properties demonstrated, reliability; [DN2]

Reliability coefficient; [DN2R] magnitude of coefficient, if given (-99 if missing)

Properties demonstrated, sensitivity; [DN3] sensitivity/responsiveness/discriminant
ability [i.e., indication that measure is capable of responding to treatment effect]

Properties demonstrated, none; [DN4] none of above

Treatment-test overlap; [DN5] Rate the extent to which the treatment content
overlaps or resembles the content of this measure, e.g., as in "teaching the test." At
one end of the continuum are measures that are virtual duplicates of the treatment,
e.g., a behavioral treatment that reinforces a specific list of behaviors and an
outcome measure that counts how often those same behaviors are performed. At the
other end of the continuum are measures that have virtually no content similarity to
the treatment, e.g., a treatment of insight-oriented counseling about family relations
and an outcome measure of math grades in school. This is not a question about the
extent to which the treatment caused the dependent variable. The question concerns
the content of the treatment, not the plausibility of the hypothesized causal
relationship. The topic area of the treatment in relation to the topic area of the
measure determines the general category. Use the 1-3 range for treatments and
measures of generally different content and involving different activities; use 3-5 for
those situations like general counseling and delinquency measures where discussion
of delinquency may well have been part of the treatment content, giving topic
overlap, but the activities of treatment (talking about delinquency) are different
from those in the measure (committing delinquency). Use the 5-7 range for fairly
clear overlap in both topic area and activity, e.g., substance abuse treatment
involving role playing resistance to peer pressure and actual substance abuse
incidents as an outcome measure. Within these ranges, adjust for the degree of
overlap according to the specifics of the individual case.

Rate this measure for treatment- test content overlap: 

Very Low Overlap 1 2 3 4 5 6 7 Very High Overlap

Social desirability bias; [DN6] Rate the extent to which this measure seems
susceptible to a social desirability response bias, that is, the extent to which the
respondents are (a) able to recognize what response "looks good," (b) may be
motivated to "look good," and (c) are able to exaggerate the response in the direction
of "looking good." Note that you are not to rate how much social desirability bias you
think actually occurred, only how susceptible you think the measure might be. At
one end of the continuum would be measures based on objective procedures
administered by impartial others, e.g., random surprise urinalysis for drug testing.
At the other end of the continuum would be the juvenile's own reports made to
someone with authority over him (e.g., probation officer) on sensitive issues (e.g.,
drug use) in open-ended fashion without expectation of verification. This is a
demand characteristics issue. This combines format or structure of the measure,
demand characteristics of the situation in which the measure is taken, and the ego
involvement of the provider of the measure. This is not a measure of the extent to
which one's behavior is changeable but the changeability of the report of that
behavior. Objective measures should rate in the 1-3 range with arrest records for
violent crimes = 1 and those for status offenses = 2. Self-report or a rating by those
who are ego involved in some way would be in the 6-7 range. In descending order of
ego involvement are: the target juveniles, parents, therapists, teachers, non-blind
researchers, CJ personnel. In descending order of response format sensitivity to bias
are: self-report, rating, objective count, and independent cross-checking or review.

Rate this measure's potential for social desirability response bias: 

Very Low Potential 1 2 3 4 5 6 7 Very High Potential


Confidence in above 2 ratings: [DN7]

Very Low    Low          Moderate     High         Very High
1           2            3            4            5
(Guess)     (Informed    (Weak        (Strong      (Explicitly
            Guess)       Inference)   Inference)   Stated)


NonDelinquency Variables 

Code a separate screen for each nondelinquency outcome measure for which the
aggregate treatment and comparison groups can be compared on the first wave of
post-treatment outcome. (Subsequent waves and breakouts for this aggregate
comparison are coded on separate attachments to be appended to this sheet.)
Nondelinquency outcome measures are all those that remain after any delinquency
outcome measures are coded on the "delinquency behavior outcome measures
codesheet" according to the definitions on that codesheet.

Type of construct represented: [N1] Construct represented by this measure
(check best one): [Note: Some categories, like "attitudes," occur in various sets below.
Approach this item by first identifying the most appropriate molar category, e.g.,
psychological adjustment, interpersonal, etc., then finding the best item within that
category for the particular measure at issue.]

Psychological adjustment 

1 attitudes re delinquency, personal conduct, police, etc.

2 self-esteem, self concept

3 other personality trait

4 behavioral problems checklist, etc.

5 knowledge re drugs, ethics, moral dilemmas, law, etc.

6 mood, anxiety, depression, emotionality, etc. 

7 other: 

Interpersonal adjustment 

8 attitudes re interpersonal issues, family, peers, etc. 

9 family functioning, communication, household chores, etc. 

10 peer relations, etc. 

11 social skills 

12 other: 

Community adjustment

13 attitudes re community, citizenship, etc. 

14 perceptions by merchants, community officials etc. 

15 other: 

School adjustment 

16 attitudes re school, teachers, etc. 

17 noncriminal/non-delinquent disciplinary

18 attendance; tardiness 

19 dropping out; graduating 

20 other: 

Academic improvement 

21 achievement (content mastery in topic area)

22 grades 

23 cognitive, general (e.g. IQ) 

24 other: 

Vocational adjustment 

25 attitudes toward work, employment, careers, etc. 

26 job attendance, tardiness

27 employment status (gets/keeps job)

28 employment learning (job content, skills) 

29 vocational learning (job finding, interview, skills, simulations) 

30 other: 
Adjustment to treatment

31 attitudes re treatment, therapist, program, etc. 

32 attendance, participation in treatment 

33 treatment progress, e.g., rating 

34 status at termination of treatment 

35 post- treatment prognosis 

36 other: 

Institutional adjustment 

37 attitudes re institution, staff, etc. 

38 program behavior, general 

39 rule compliance (non-criminal)

40 getting along with staff, peers 

41 post release prognosis 

42 other: 

43 global adjustment/improvement; individualized criteria (e.g., global rating)

44 all other: 

Confidence in construct; [N2] Confidence in identification of construct
represented by measure:

Very Low    Low          Moderate     High         Very High
1           2            3            4            5
(Guess)     (Informed    (Weak        (Strong      (Explicitly
            Guess)       Inference)   Inference)   Stated)
Type of measure [N3] (check best one):

1 psychometric, standardized, multi-item (e.g., achievement, attitude,
personality, MMPI)

2 criterion referenced or goal setting; mastery; behavioral objectives: test, form,
or questionnaire

3 behavioral observation; behavioral report; behavioral record or charts

4 survey type items, questionnaire, self report form

5 judgment ratings; judgment coding from observation by other(s)

6 archival report (e.g., school, agency records)

7 projective test (e.g., TAT, Rorschach) 

8 other: 

9 cannot tell 

Origin of measure [N4] (check best one) : 

1 "off the shelf" named measure or scale 

2 taken intact from other research, not in general use 

3 adapted or modified from other source 

4 pre-existing records or archives 

5 new instrument apparently developed for this evaluation 

6 other: 

7 cannot tell 

Source of information; [N5] Primary source of information for measure (check
best one): [Note: Issue here is who is forming the content recorded in the measure.
E.g., if a person fills out a form or responds to an interview, that person is the
information source. If an observer rates or judges another person, however, it is the
observer, not the person observed, who is the source.]

1 juveniles themselves (e.g., self report, survey)

2 front line service provider; therapist; caseworker 

3 program manager, administrator, agency staff, etc. (not front line) 

4 researchers acting directly as observers, raters, etc. 

5 other observers or participants (e.g., client families, employers) 

6 records, archives 

7 other: 

8 cannot tell 

Properties of this measure demonstrated, reported, or cited (check all that apply):

Properties demonstrated, validity; [DN1]

Properties demonstrated, reliability; [DN2]

Reliability coefficient; [DN2R] magnitude of coefficient, if given (-99 if missing)

Properties demonstrated, sensitivity; [DN3] sensitivity/responsiveness/discriminant
ability [i.e., indication that measure is capable of responding to treatment effect]

Properties demonstrated, none; [DN4] none of the above.


Treatment-test overlap; [DN5] Rate the extent to which the treatment content
overlaps or resembles the content of this measure, e.g., as in "teaching the test." At
one end of the continuum are measures that are virtual duplicates of the treatment,
e.g., a behavioral treatment that reinforces a specific list of behaviors and an
outcome measure that counts how often those same behaviors are performed. At the
other end of the continuum are measures that have virtually no content similarity to
the treatment, e.g., a treatment of insight-oriented counseling about family relations
and an outcome measure of math grades in school. This is not a question about the
extent to which the treatment caused the dependent variable. The question concerns
the content of the treatment, not the plausibility of the hypothesized causal
relationship. The topic area of the treatment in relation to the topic area of the
measure determines the general category. Use the 1-3 range for treatments and
measures of generally different content and involving different activities; use 3-5 for
those situations like general counseling and delinquency measures where discussion
of delinquency may well have been part of the treatment content, giving topic
overlap, but the activities of treatment (talking about delinquency) are different
from those in the measure (committing delinquency). Use the 5-7 range for fairly
clear overlap in both topic area and activity, e.g., substance abuse treatment
involving role playing resistance to peer pressure and actual substance abuse
incidents as an outcome measure. Within these ranges, adjust for the degree of
overlap according to the specifics of the individual case.

Rate this measure for treatment-test content overlap: [DN5]

Very Low Overlap 1 2 3 4 5 6 7 Very High Overlap

Social desirability bias; [DN6] Rate the extent to which this measure seems
susceptible to a social desirability response bias, that is, the extent to which the
respondents are (a) able to recognize what response "looks good," (b) may be
motivated to "look good," and (c) are able to exaggerate the response in the direction
of "looking good." Note that you are not to rate how much social desirability bias you
think actually occurred, only how susceptible you think the measure might be. At
one end of the continuum would be measures based on objective procedures
administered by impartial others, e.g., random surprise urinalysis for drug testing.
At the other end of the continuum would be the juvenile's own reports made to
someone with authority over him (e.g., probation officer) on sensitive issues (e.g.,
drug use) in open-ended fashion without expectation of verification. This is a
demand characteristics issue. This combines format or structure of the measure,
demand characteristics of the situation in which the measure is taken, and the ego
involvement of the provider of the measure. This is not a measure of the extent to
which one's behavior is changeable but the changeability of the report of that
behavior. Objective measures should rate in the 1-3 range with arrest records for
violent crimes = 1 and those for status offenses = 2. Self-report or a rating by those
who are ego involved in some way would be in the 6-7 range. In descending order of
ego involvement are: the target juveniles, parents, therapists, teachers, non-blind
researchers, CJ personnel. In descending order of response format sensitivity to bias
are: self-report, rating, objective count, and independent cross-checking or review.

Rate this measure's potential for social desirability response bias: 

Very Low Potential 1 2 3 4 5 6 7 Very High Potential

Confidence in above 2 ratings: [DN7]

Very Low    Low          Moderate     High         Very High
1           2            3            4            5
(Guess)     (Informed    (Weak        (Strong      (Explicitly
            Guess)       Inference)   Inference)   Stated)


Effect Size Calculation (ES) 


Weeks Delinquency Counted [ES20] (leave blank if nondelinquency variable)

|__|__|__| Approximate (or exact) time period covered by delinquency measure,
i.e., period over which counted delinquency occurs, e.g., whether arrested during last
six months. (Code number of weeks, rounded to nearest whole number; divide days
by 7 and round; multiply months by 4.3 and round; code 999 if cannot tell or NA,
but try to make an estimate if possible. Code 888 if total prior history covered.)

Weeks Post-Treatment Measured [Time1]

|__|__|__| Approximate (or exact) weeks after end of treatment when measure
taken, i.e., what was the interval from the end of the treatment to the time when this
outcome measure was taken. (Code whole number, no decimals; divide days by 7
and round to whole number; multiply months by 4.3 and round; code 999 if cannot
tell, but try to make an estimate if possible.) [NOTE: If measure was taken more or
less immediately at the end of treatment, code this as one week.]


Effect Size Statistics 

[Note: Complete as much of this item as possible even if it requires some calculation 
or manipulation of data presented in the report. Use separate treatment vs. control 
group statistics if available, otherwise statistics for pooled groups if they are 
available. If neither available, enter missing data codes.] 


Original N 

Number of subjects originally assigned/ selected for the treatment and control 
groups before any attrition, dropouts, refusals to participate, etc. (missing=999). 
[Note: The issue here is attrition between assignment/ selection for treatment and 
measurement. If attrition data after pretest and after group assignment conflict, 
code the latter. The three common ways to get information on the original group size 
are from assignment to treatment groups, the actual pretest data for measures (if 
there are differences in n between the various pretests, use the largest one) and 
demographics at pretest. The largest number claimed for each group by any of these 
sources should be considered the n at assignment.] 
treatment group [ES36] 
control group [ES37] 
total/ difference [ES38] 

effect size total N if treatment or control N's not known [ESS by hand] 
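
The "largest number claimed" rule for Original N can be illustrated with a short sketch. The function and argument names are hypothetical, chosen only for this example, and the sketch does not cover the separate instruction to prefer post-assignment attrition data when sources conflict.

```python
MISSING_N = 999  # missing-data code given on the form for Original N


def original_n(assignment_n=None, pretest_ns=(), demographics_n=None):
    """Return the group size at assignment for one group.

    The form names three common sources: assignment to groups, pretest
    data (use the largest n if pretests differ), and demographics at
    pretest. The largest number claimed by any source is taken as the n.
    """
    candidates = [assignment_n, demographics_n, *pretest_ns]
    known = [n for n in candidates if n is not None]
    return max(known) if known else MISSING_N


# Pretests report 48 and 52 cases; demographics table reports 50.
treatment_n = original_n(pretest_ns=(48, 52), demographics_n=50)
```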

Effect Size N: Number of subjects whose data is actually represented in the statistics 
for the outcome on which the effect size calculation is based (missing=9999). 
treatment group [ES1] 
control group [ES2] 

total/ difference [ES3] 


Effect size total N if treatment and control group Ns not known [ESS] 
Mean on measure (missing=999.99) 
treatment group [ES9] 
control group [ES10] 
total/ difference [ES11] 


Variance on measure (missing=99.99) 
treatment group [ES12] 
control group [ES13] 
total/ difference [ES14] 
SD (standard deviation) [ES25] [ES26] 

SE (standard error) [ES27] [ES28] 

Proportion successful [ES29] [ES30] 

N successful [ES31] [ES32] 

Enter here the raw values for "N Successful" if they are provided. Do not calculate "N 
successful" from the effect size N and the proportion. Only enter N successful if it is 
given explicitly. 

t-value [ES33] 

F-value (df=1) [ES34] 

Chi-square (df=1) [ES35] 

Enter values as appropriate and available. Note: if you have, or can determine, the 
proportion or frequency who "failed" or "succeeded" be sure to enter that 
information. 

Effect size (by FileMaker or by hand) 

|_|_|.|_|_| ES (two decimals with an algebraic sign in front: plus if favors 
treatment (i.e., more "success" for treatment group than control), minus if favors 
control, +9.99 if NA). 
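
The form does not state which formula produces the effect size, so the following Python sketch is an assumption: it uses the conventional standardized mean difference with a pooled standard deviation, plus the usual conversion from an independent-groups t-value. The function names are hypothetical. The `higher_is_better` flag illustrates the sign convention above: for a delinquency count, a lower treatment mean favors treatment and is therefore coded with a plus sign.

```python
import math


def d_from_means(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference (treatment minus control), pooled SD."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def d_from_t(t_value, n_t, n_c):
    """Standardized mean difference recovered from an independent-groups t."""
    return t_value * math.sqrt(1 / n_t + 1 / n_c)


def format_es(d, higher_is_better=True):
    """Apply the form's sign rule and format with two decimals.

    Plus means the difference favors treatment, minus favors control.
    """
    signed = d if higher_is_better else -d
    return f"{signed:+.2f}"


# Treatment youth average 8 offenses, controls 10 (lower is better).
d = d_from_means(8, 10, 4, 4, 50, 50)
coded = format_es(d, higher_is_better=False)
```

Keeping the raw difference separate from the sign rule lets one computation serve both measures where high scores mean success (e.g., grades) and measures where low scores do (e.g., arrests).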

Pre-test, Post-test, or Follow-up [ES24] 

Identify the type of effect size in terms of the time of measurement of the data on 
which the treatment vs. control comparison represented in the effect size is made. 
[NOTE: Code the available information for any dependent variable for which the 
direction of the difference can be determined (whether favors treatment, control, or 
neither) even if a numerical effect size value cannot be determined.] 

"Pretest" refers to measures of status before treatment or at the beginning of 
treatment on the same variable used as an outcome measure. E.g., delinquency 
index for an interval prior to treatment is the "pretest" for the delinquency index for 
the same length interval subsequent to treatment. 

"Posttest" refers to measures of status on first wave of measurement after the 
treatment is completed. 

"Follow-up" refers to measures of status at any wave of measurement after the 
posttest, i.e., for there to be a follow-up, there must be at least two waves of 
measurement after treatment is completed; the first would be the posttest, the 
second (and any others thereafter) would be a follow-up. 

Type of means [ES15] [Note: If ES based on proportion or N successful, code as 
proportion mean.] 

1 arithmetic mean of scores 

2 median of scores 

3 proportion or rate 

4 other: 

5 cannot tell 
Type of variances [ES16] 

[Note: If ES based on proportion or N successful, code as proportion variance.] 

1 standard deviation 

2 variance 

3 standard error 

4 proportion 

5 other: 

6 cannot tell 


Direction of Difference [ES17] 

Numerically comparing treatment group scores to control group scores on this 
measure, the raw treatment vs. control group difference favors (i.e., shows more 
"success" for) which group (check best one). [Note: Report this information if 
available even if the numerical values on the variables are not reported.] 

1 treatment 

2 control 

3 neither (exactly equal) 

4 cannot tell or statistically insignificant report only 


Type of Statistical Test for T-C difference [ES18] 

1 no test done 

2 kind of test not reported 

3 t, F, Z, or r (parametric, no partialling or variance adjustment) 

4 Chi-square test 

5 other nonparametric test, e.g., Mann-Whitney U 

6 test adjusts for covariate, not pretest (e.g., ANCOVA, covariate blocking) 

7 test adjusts for PRETEST (e.g., ANCOVA with pretest covariate, repeated 
measures design, t-test using gain scores) 

8 other 

9 missing 


Statistical Significance Difference [ES19] 

[Note: report what the author claims at whatever alpha level, etc. used; if only 
p-values provided with no statement of what is judged statistically significant, code 
anything with p<.05 as significant.] 

1 significant 

2 not significant 

3 not reported 
Effect Size Confidence [ES22] (Confidence in effect size value) 

1 Highly Estimated 

2 Moderate Estimation 

3 Some Estimation 

4 Slight Estimation 

5 No Estimation 


[Note: Confidence guidelines: 

5 No Estimation: have descriptive data; can calculate ES directly. 

4 Slight Estimation: have significance testing statistics rather than descriptive 
statistics, but have complete statistics of conventional sort. 

3 Some Estimation: have unconventional statistics and must convert to 
equivalent t-values or have conventional statistics but incomplete, e.g., exact p 
level only. 

2 Moderate Estimation: have complex but relatively complete stats, e.g., 
multiple regression, LISREL, multifactor ANOVA etc. as basis for estimation. 

1 Highly Estimated: have N and crude p value only, e.g., p<.10, and must 
reconstruct 
Page Number Where ES found: 
Report in which ES found: 
Appendix 2: DuBois et al. Coding 


id 


3 4 5 6 

aut 


9 10 11 12 

jour 


yr 


1 


7 8 


13 14 


15 16 


Study ID 

Report Identification 

Title 


Author(s): (enter first six letters of first author) 


Journal (Enter abbreviation of journal title, e.g., JCCP) or ED # 
Year: 

C.17=Blank 


pub 


sou 


prgid 


intgrp 


nintgrp 


Publication Vehicle: 

Journal 

Dissertation 

Book 

Thesis 

Paper presentation 
Government report 
Private evaluation 

Source: 

PsychINFO 

ERIC 

Medline 

Dissertation Abstracts 

Other Data Base (specify ) 

Ancestry (specify Study ID# ) 

Research Known to First Author 


21 


C.20=Blank 

Mentoring Program Information and Study Design 

Program ID 

Nature of Intervention Group 
Mentoring alone 

Mentoring and other intervention (specify ) 

Nature of Comparison Group 

Did not receive an intervention 
Received mentoring 

Received intervention other than mentoring (specify ) 
setting 


prgloc 


type 


instype 


comp 


mtcrit 


mtgen 

mtrac 

mtint 

mtset 

mtoth 


scm 


bkchk 


int 


25 


26 


27 


28 


29 


30 


31 

32 

34 


36 


37 


38 


Setting Where Mentoring Activities Occurred 

1. Community 

2. School 

3. Workplace 

4. Institution/Agency/Organization (other than school) 

5. Other (specify ) 

6. Unspecified 

Location of Program (City Size) 

Large Urban 
Small Urban 
Suburban 
Rural 
Mixed 

Program Type 

1. Instrumental (specify ) 

2. Psychosocial 

3. Combination 

4. Other (specify ) 

5. Unspecified 

Type of Instrumental Focus (if applicable) 

Educational 

Employment 

Other (specify ) 

Mentor Compensation 

Educational (course credit, class assignment, etc.) 

Financial 

Other (specify ) 

None/Volunteer 
Unspecified 

Mentor/Mentee Match Criteria? 

0 = No 

1 = Yes 

2 = Unspecified 

Mentor/Mentee Match Criteria (if applicable; 0=No, 1=Yes) 

Gender 

Race/Ethnicity 

Interests 

Setting 

Other (specify ) 


Mentor Screening? 

0 = No 

1 = Yes 

2 = Unspecified 

Mentor Screening Criteria (if applicable; 0=No, 1=Yes) 

Background checks (criminal records, references, etc.) 
In-person interview 

Home visit 


hmvst 


Psychological testing 
psytst 

40 

Other (specify ) 

scroth 

41 

Mentor Training Prior to Match? 

train 

42 

0 = No 

1 = Yes 

2 = Unspecified 



Amount of Training (if applicable; code in hours, considering 1 session = 
2 hours if only this information is available; round to nearest whole #) 

amttm 

43 44 



Characteristics of Training (if applicable; 0=No, 1=Yes) 



Instructor-Led 

instm 

45 

Prepared Materials Used (e.g., video, workbook, etc.) 

mattm 

46 

Individual 

indtm 

47 

Group 

grptrn 

48 

Self-Study 

ssttm 

49 

Unspecified 

unstm 

50 

Mentor Supervision? 

super 

51 

0=No 

1=Yes 

2=Unspecified 



Frequency of Supervision (if applicable; code # hours per month, 
considering 1 meeting = 1 hour and rounding to whole #s) 

frqsup 

52 53 

Type of Supervisory Contacts (if applicable) 

typsup 

54 

In-Person 

Telephone 

Mail 

Mixed 

Unspecified 



Ongoing Mentor Training? 

ongtm 

55 

0=No 

1=Yes 

2=Unspecified 



Amount of Ongoing Training (if applicable; code # hours per month, 
considering 1 session = 1.5 hours if only this information is available; 
round to nearest whole #) 

amtont 

56 57 



Mentor/Mentee Contact Time Expectations/Guidelines? 

mmcnt 

58 

0=No 

1=Yes 

2=Unspecified 



Expected Frequency Mentor/Mentee Contact (# hours/week, 
considering each visit to be 2 hours if information is only provided 
in this form and rounding to whole #s) 

expcnt 

59 60 

Was actual frequency of mentor/mentee contact measured? 


61 

0=No 
1=Yes 




Actual Average Frequency of Mentor/Mentee Contact (# hours/week, 
considering each visit to be 1.5 hours if information is only provided in 
this form and rounding to whole #s) 

actcnt 

62 

63 




Mentor/Mentee Length of Relationship Expectations/Guidelines? 

mmlng 


64 

0=No 




1=Yes 




2=Unspecified 




Expected Length of Mentor/Mentee Relationship (# of months, 
considering 4 weeks = 1 month if information only provided in this form, 
rounding to nearest whole month) 

explng 

65 

66 




Was actual length of mentor/mentee relationships measured? 



67 

0=No 




1=Yes 




Actual Average Length of Mentor/Mentee Relationship (# of months, 
considering 4 weeks = 1 month if information only provided in this form, 
rounding to nearest whole month) 

actlng 

68 

69 




END LINE#1/BEGIN LINE#2 (Repeat Study ID C.1-C.2; C.3=Blank) 




What was the average age of the mentors? (round to nearest whole#) 

menage 

4 

5 





Developmental Level of Mentors 

mendev 


6 

Adolescence (12-18 years of age) 




Early Adulthood (19-29 years of age) 




Middle Adulthood (30-54 years of age) 




Late Adulthood (55 and older) 




Mixed (adult only) 




Mixed (adolescent and adult) 




Unspecified 




Gender of Mentors (percentage male, rounding to whole #) 

mengen 

7 

8 





Race/Ethnicity of Mentors (percentages, rounding to whole #s) 




White/Caucasian 

menwht 

9 

10 

Black/African-American 

menblk 

11 

12 

Native American 

mennta 

13 

14 

Asian-American 

menasi 

15 

16 

Hispanic 

menhis 

17 

18 





Other 

menotr 

19 

20 

int(l)/ext(2)? all (ly/On) n-mexc.? mon.? par? stmcact.? mnsupgrp? 




Educational/Professional Level and Background of Mentors 

edprlv 


22 

Less than High School Diploma 




High School Diploma or GED 




Some College 




Undergraduate Degree 




Graduate Degree 




Mixed 




Unspecified 




Background in Helping Profession/Role? 


The Campbell Collaboration | www.campbellcollaboration.org 



helbkg 




23 

Yes, Occupation/Education 
Yes, Parent/Caretaker 
No 

Mixed 

Unspecified 






Is there a control group? 

Ctrl? 




24 

0=No 






1=Yes 






Type of Control: 

ctrltype 




25 

1. Pretest 
2. Random Assignment 
3. Static Group 






Is there a pretest? 

pre? 




26 

0=No 






1=Yes 






If there was a pretest, what type is it? 

pretype 




27 

Identical to post test 
Functionally the same as post test 






If there was a pretest, to whom was it given? 

pregq? 




28 

1. Given to intervention group only 
2. Given to both intervention and control groups 






If there was a static group were there non-statistical procedures for 
creating equality? 

statcov 




29 

0=No 






1=Yes 






If the procedure was matching, what variables were used to match (0=No, 
1=Yes)? 






Sex 

sexcov 




30 

Race/Ethnicity 

raccov 




31 

Age/Grade Level 

agecov 




32 

School Attended 

schcov 




33 

Family Structure 

famscov 




34 

Family Income Level/SES 

famicov 




35 

Achievement Level 

achcov 




36 

Emotional/Behavioral Adjustment Level 

embcov 




37 







END LINE#2/BEGIN LINE#3 (Repeat Study ID C.1-C.2; C.3=Blank) 

Independent Sample and Deletion Codes 

Independent Sample Code 

is 

4 

5 

6 

7 

Deletion Code (0=Don't Delete; 1=Delete-multicomponent program; 
2=Delete-comparison within intervention group; 3=Delete-secondary 
subgrouping of sample) 

deled 

8 






C.9=Blank 

Participant Characteristics 

# of Females 
nfem 

10 

11 

12 

13 

# of Males 

nmal 

14 

15 

16 

17 

# of Both Males and Females (enter only if separate #s for Males and 
Females not available) 

nboth 

18 

19 

20 

21 






Average Age (in years, at start of program, rounded to nearest whole #) 

ythage 



22 

23 

Developmental Level 

ythdev 




24 

Early Childhood (5-8) 
Middle/Late Childhood (9-11) 
Early Adolescence (12-14) 
Middle/Late Adolescence (15-18) 
Mixed 
Unspecified 
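
As an illustration of the age bands listed above, the sketch below maps a single age in whole years to its developmental-level label. The function name is hypothetical, and returning `None` for ages outside 5-18 is an assumption; the form itself provides "Mixed" and "Unspecified" for samples that do not fit one band, and a coder would choose those from the study description.

```python
# Age bands copied from the form: (lowest age, highest age, label).
YOUTH_LEVELS = [
    (5, 8, "Early Childhood"),
    (9, 11, "Middle/Late Childhood"),
    (12, 14, "Early Adolescence"),
    (15, 18, "Middle/Late Adolescence"),
]


def youth_developmental_level(age):
    """Return the form's developmental-level label for an age in years.

    Returns None when the age falls outside every band (an assumption;
    the form does not say how to treat such ages).
    """
    for low, high, label in YOUTH_LEVELS:
        if low <= age <= high:
            return label
    return None


# A sample whose participants are all 13 years old.
level = youth_developmental_level(13)
```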






SES 

ythses 




25 

1. Low 
2. Middle 
3. High 
4. Mixed 
5. Unspecified 






Race/Ethnicity (percentages, rounding to whole #s) 






White/Caucasian 

ythwht 



26 

27 

Black/African-American 

ythblk 



28 

29 

Native American 

ythnta 



30 

31 

Asian- American 

ythasi 



32 

33 

Hispanic 

ythhis 



34 

35 

Other 

ythotr 



36 

37 

At-Risk Status 

risk 




38 

Environmental Factors (e.g., single-parent home) 
Individual Factors (e.g., academic difficulty) 

Both Environmental and Individual Factors 
Neither 

Unspecified ALSO SINGPAR(<=75%) [1Y/0N]: 






END LINE#3/BEGIN LINE#4 (Repeat Study ID C.1-C.2; C.3=Blank) 
Independent Sample ID 

Outcome Variable Information 

Criterion Measure 

crit 



4 

5 

1. Self-Esteem/Self-Concept 

2. Perceived Self-Efficacy/Sense of Mastery 

3. Classroom Behavior 

4. Report Card Grades 

5. School Attendance 

6. Academic Achievement Test Scores 

7. Academic Self-Concept/Self-Esteem 

8. Attitudes Toward School 

9. School Drop-Out 

10. Intelligence/Cognitive Skills and Abilities 

11. Substance Use 

12. Substance Use Attitudes/Knowledge 

13. Problem/High-Risk Behavior (other than Substance Use) 

14. Psychological/Emotional Distress 

15. Psychological/Emotional Well-Being 

16. Peer Relationships 

17. Family Relationships 

18. Social/Cultural Activity 

19. Social Skills/Social Competence 

20. Coping Behavior 

21. Community Service 

22. Other (specify ) 

Data Source 
1. Youth 
2. Parent 
3. Teacher 
4. Mentor 
5. Administrative Records 
6. Other (specify ) 


Time of Data Collection for Post-Test Relative to End of Program 
During mentor relationship 
Immediate post-test 
Follow-up 

If follow-up assessment, what was length of interim period from the end 
of the program? (specify in weeks, rounded to nearest whole #) 


intpre 

12 

13 

14 

15 

16 

intsdpr 


17 

18 

19 

20 

intpst 

21 

22 

23 

24 

25 

intsdpst 


26 

27 

28 

29 

nint 


30 

31 

32 

33 

cntpre 

35 

36 

37 

38 

39 


C.11=Blank 

Statistical Outcomes 
Group 1 (Intervention) 

M (pre-test) 

SD 

M (post-test) 

SD 

n 

C.34=Blank 

Group 2 (Control) 

M (pre-test) 
cntsdpr 

40 

41 

42 

43 

M (post-test) 

cntpst 44 

45 

46 

47 

48 

SD 

cntsdps 

49 

50 

51 

52 

n 

ncnt 

53 

54 

55 

56 

C.57=Blank 






Type of d 

dtype 




58 

Post - pre 

Post(int) - Post(cnt) 






How was the d index derived? 

dder 




59 

1 Unadjusted means comparison 
2 ANCOVA/MR w/pre-test only as covariate 
3 ANCOVA/MR w/only control measure[s] other than pretest 
4 ANCOVA/MR w/both pre-test and other measures as controls 
5 Blocking Design 






Direction of Effect Size 

ddir 




60 

Positive 

Negative 






Significance of Finding (p < .05 two-tailed; 1=yes; 2=no; 3=not available) 

dsig 




61 


d 

62 

63 

64 

65 

d index or 
Appendix 3: Tolan et al. Additional Codes for Mentoring Meta-analysis 

1. Differentiate Risk into: 

1. Behavioral (aggression, delinquency level, etc.) 

2. Environmental-Individual Differences such as Family, 

School Achievement 

3. Hi-Risk Setting such as Violent Community 


2. Mentoring Activities Included (yes/no) 

1. Emotional Support 

2. Teaching/Information Provision 

3. Advocacy 

4. Modeling 

5. Acting as Identification Figure 


3. Nature of Relationship/Basis of Mentoring 

1. Survivor (had same issues) 

2. Civic Duty: as part of job or otherwise volunteer to help 
those in need 

3. Professional Development or Duty 

4. Other 


4. Implementation Quality 

1. Checked or not 


2. Fidelity or Application of Key Principles Checked 

3. Retention of Participants in Program (percent) 

4. Retention of Participants in Study (percent) 

5. Evidence that Mentors Retained, yes/no, percent 